Abstract



This thesis deals with a series of quantum computer implementation 
issues from the Kane ³¹P in ²⁸Si architecture to Shor's integer factoring
algorithm and beyond. The discussion begins with simulations of the adi- 
abatic Kane CNOT and readout gates, followed by linear nearest neighbor 
implementations of 5-qubit quantum error correction with and without fast 
measurement. A linear nearest neighbor circuit implementing Shor's algo- 
rithm is presented, then modified to remove the need for exponentially small 
rotation gates. Finally, a method of constructing optimal approximations 
of arbitrary single-qubit fault-tolerant gates is described and applied to the 
specific case of the remaining rotation gates required by Shor's algorithm. 
1. Introduction



This chapter begins with a brief review of the quantum computing field with 
a bias towards the specific topics developed in later chapters. The main 
purpose of the review is to introduce the concepts and language required 
to overview the aims and content of this thesis (Section 1.4). The reader
familiar with quantum computing can go directly to Section 1.4. Familiarity
with quantum mechanics is assumed.

In Section 1.1 we provide justification and motivation for the study of
quantum computing. Section 1.2 reviews the various models of quantum
computation, namely the circuit model, adiabatic quantum computation,
cluster states, topological quantum computation, and geometric quantum
computation. The theoretical work demonstrating that arbitrarily large
quantum computations can be performed arbitrarily reliably is gathered in
Section 1.3. Section 1.4 then overviews the thesis.

1.1 Why quantum compute? 

The incredible exponential growth in the number of transistors used in
conventional computers, popularly described as Moore's Law, is expected to
continue until at least the end of this decade [1]. This growth is achieved
through miniaturization. Smaller transistors consume less power, can be
packed more densely, and switch faster. However, in some areas, particularly
the silicon dioxide insulating layer within each transistor, atomic scales
are already being approached [1]. This fundamental barrier is one of many
factors fuelling research into radical new computing technologies.

Furthermore, certain problems simply cannot be solved efficiently on 
conventional computers irrespective of their inexorable technological and 
computational progress. One such problem is the simulation of quantum 
systems. The amount of classical data required to describe a quantum sys- 
tem grows exponentially with the system size. The data storage problem 
alone precludes the existence of an efficient method of simulation on a con- 
ventional computer. 
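
To make the storage argument concrete, the following minimal sketch
tabulates the memory needed to hold the full state vector of n two-level
systems. It assumes one double-precision complex amplitude (16 bytes) per
basis state; the function name and the byte count are illustrative choices
and are not taken from this thesis.

```python
def state_vector_bytes(n_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed to store all 2**n_qubits complex amplitudes.

    Assumption: 16 bytes per amplitude (two 64-bit floats).
    """
    if n_qubits < 0:
        raise ValueError("n_qubits must be non-negative")
    return (2 ** n_qubits) * bytes_per_amplitude


if __name__ == "__main__":
    for n in (10, 20, 30, 40, 50):
        gib = state_vector_bytes(n) / 2 ** 30
        print(f"{n:3d} two-level systems: {gib:.6g} GiB")
```

Each additional two-level system doubles the requirement; under the stated
assumption, 30 systems already need 16 GiB.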

The first hint of a way around this impasse was provided by Feynman 
in 1982 [2] when he suggested using quantum mechanical components to 
store and manipulate the data describing a quantum system. The number 
of quantum mechanical components required would be directly proportional 
to the size of the quantum system. This idea was built on by Deutsch in 
1985 to form a model of computation called a quantum Turing machine 
— the quantum mechanical equivalent of the universal Turing machine 
which previously was thought to be the most powerful and only model of 
computation. 

That the laws of physics in principle permit the construction of quantum
computers exponentially more powerful than their classical relatives is
hugely significant. Despite this, it was not until the publication of Shor's
quantum integer factoring algorithm [5, 6] that research into quantum
computing began to attract serious attention. The difficulty of classically
factoring integers forms the basis of the popular RSA encryption protocol
[7]. RSA is used to establish secure connections over the Internet, enabling
the transmission of sensitive data such as passwords, credit card details,
and online banking sessions. RSA also forms the heart of the popular secure
messaging utility PGP (Pretty Good Privacy) [8]. Rightly or wrongly, the
prospect of rendering much of modern classical communication insecure has
arguably driven the race to build a quantum computer.

More recently, quantum algorithms offering an exponential speed up over
their classical equivalents have been devised for problems within group
theory [9], knot theory [10], eigenvalue calculation [11], image processing
[12], basis transformations [13], and numerical integrals [14]. Other
promising quantum algorithms exist [15, 16], but have not been thoroughly
analyzed.

On the commercial front, the communication of quantum data has been
shown to enable unconditionally secure communication in principle [17],
resulting in the creation of companies offering real products [18, 19] that
have already found application in the information technology sector [20].
Despite this, it remains to be seen whether human ingenuity is sufficient to
make large-scale quantum computing a reality.

1.2 Models of quantum computing 

Classical computers have a single well defined computational model: the
direct manipulation of bits via Boolean logic. The field of quantum
computation is too young to have settled on a single model. This section
attempts a brief yet complete review of the various quantum computation
models currently under investigation. We neglect quantum Turing machines as
they are a purely abstract rather than a physically realizable computation
model. We also neglect quantum neural networks [21, 22] due to their use of
nonlinear gates, and both Type II quantum computers [23] and quantum
cellular automata [24] due to their essentially classical nature.
1.2.1 The circuit model



The most widely used model of quantum computation is called the circuit
model. Instead of the traditional bits of conventional computing, which can
take the values 0 or 1, the circuit model is based on qubits, which are
quantum systems with two states denoted by |0⟩ and |1⟩. The power of quantum
computing lies in the fact that qubits can be placed in superpositions
α|0⟩ + β|1⟩, and entangled with one another, e.g. (|00⟩ + |11⟩)/√2.
Manipulation of qubits is performed via quantum gates. An n-qubit gate is a
2^n × 2^n unitary matrix. The most general single-qubit gate can be written
in the form

\[ U = \begin{pmatrix} e^{i(\alpha+\beta)/2}\cos(\theta/2) & e^{i(\alpha-\beta)/2}\sin(\theta/2) \\ -e^{i(-\alpha+\beta)/2}\sin(\theta/2) & e^{-i(\alpha+\beta)/2}\cos(\theta/2) \end{pmatrix}. \tag{1.1} \]

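Common single-qubit gates are

\[ H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}, \quad T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}. \tag{1.2} \]

For example, the result of applying an X-gate to α|0⟩ + β|1⟩ is

\[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} \beta \\ \alpha \end{pmatrix}. \tag{1.3} \]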



The X-gate will sometimes be referred to as a NOT gate or inverter. Its
action will sometimes be referred to as a bit-flip or inversion, and that of
the Z-gate as a phase-flip. The H-gate was derived from the Walsh-Hadamard
transform [25, 26] and first named as the Hadamard gate in [27].
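
The action of these gates can be checked numerically. The sketch below is
a minimal illustration in plain Python, using the standard matrix
definitions of H, X, Z, S and T; the helper names (apply, compose, close)
and the sample amplitudes are our own illustrative choices.

```python
import cmath
import math

# Standard matrix definitions, stored as nested lists [row][column].
SQRT_HALF = 1 / math.sqrt(2)
H = [[SQRT_HALF, SQRT_HALF], [SQRT_HALF, -SQRT_HALF]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
S = [[1, 0], [0, 1j]]
T = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]
IDENTITY = [[1, 0], [0, 1]]


def apply(gate, state):
    """Multiply a 2x2 gate into a length-2 column vector."""
    return [sum(gate[r][c] * state[c] for c in range(2)) for r in range(2)]


def compose(first, second):
    """Matrix of the circuit that applies `first`, then `second`."""
    return [[sum(second[r][k] * first[k][c] for k in range(2))
             for c in range(2)] for r in range(2)]


def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(a[r][c] - b[r][c]) < tol
               for r in range(2) for c in range(2))


alpha, beta = 0.6, 0.8j             # arbitrary normalized amplitudes
flipped = apply(X, [alpha, beta])   # bit-flip: the amplitudes swap
phased = apply(Z, [alpha, beta])    # phase-flip: the sign of beta changes
```

In this sketch compose(H, H) returns the identity, compose(S, S)
reproduces Z, and compose(T, T) reproduces S, which is a quick way of
confirming the conventions used for each matrix.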
Given the ability to implement arbitrary single-qubit gates and almost
any multiple-qubit gate, arbitrary quantum computations can be performed
[28, 29]. The column vector form of an arbitrary 2-qubit state
|Ψ⟩ = α|00⟩ + β|01⟩ + γ|10⟩ + δ|11⟩ is

\[ |\Psi\rangle = \begin{pmatrix} \alpha \\ \beta \\ \gamma \\ \delta \end{pmatrix}. \tag{1.4} \]

Note the ordering of the computational basis states. For convenience, such
states will occasionally be denoted by |q1 q0⟩ with q1 (q0) referred to as
the first (last) or left (right) qubit regardless of the actual physical
arrangement. The most common 2-qubit gate is the controlled-NOT (CNOT)

\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \tag{1.5} \]

which, given an arbitrary 2-qubit state |q1 q0⟩, inverts the target qubit q0
if the control qubit q1 is 1. Note that a CNOT with target qubit q1 and
control qubit q0 would have the form

\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}. \tag{1.6} \]
Fig. 1.1: (a) Common single-qubit gates, (b) the CNOT gate with solid dot
representing the control qubit, (c) the swap gate.



The swap gate

\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{1.7} \]

swaps the states of q1 and q0. Additional 2-qubit gates will be defined as
required.
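
The basis ordering |q1 q0⟩ = |00⟩, |01⟩, |10⟩, |11⟩ matters when reading
these 4 × 4 matrices. The following minimal sketch writes out both CNOT
orientations and the swap gate, applies them to basis states, and checks the
well-known identity that three alternating CNOTs implement a swap. The
constant and helper names are illustrative assumptions.

```python
# Basis ordering |q1 q0>: index = 2*q1 + q0, i.e. |00>, |01>, |10>, |11>.
CNOT_CONTROL_Q1 = [[1, 0, 0, 0],   # control q1 (left), target q0 (right)
                   [0, 1, 0, 0],
                   [0, 0, 0, 1],
                   [0, 0, 1, 0]]
CNOT_CONTROL_Q0 = [[1, 0, 0, 0],   # control q0 (right), target q1 (left)
                   [0, 0, 0, 1],
                   [0, 0, 1, 0],
                   [0, 1, 0, 0]]
SWAP = [[1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]


def basis_state(q1, q0):
    """Column vector of the computational basis state |q1 q0>."""
    index = 2 * q1 + q0
    return [1 if i == index else 0 for i in range(4)]


def apply(gate, state):
    """Multiply a 4x4 gate into a length-4 column vector."""
    return [sum(gate[r][c] * state[c] for c in range(4)) for r in range(4)]


def matmul(a, b):
    """Matrix product a.b of two 4x4 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


# Three alternating CNOTs exchange the two qubits.
three_cnots = matmul(CNOT_CONTROL_Q1,
                     matmul(CNOT_CONTROL_Q0, CNOT_CONTROL_Q1))
```

For instance, apply(CNOT_CONTROL_Q1, basis_state(1, 0)) returns the vector
for |11⟩, while the same input is left unchanged by CNOT_CONTROL_Q0.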

By representing qubits as horizontal lines, a time sequence of quantum
gates can conveniently be represented by notation that looks like a
conventional circuit. Symbols equivalent to the gates described above are
shown in Fig. 1.1. An example of a complete circuit is shown in Fig. 1.2a.
Note that the horizontal lines represent time flowing from left to right,
not wires. We define the depth of a quantum circuit to be the number of
layers of 2-qubit gates required to implement it. Note that multiple
single-qubit gates and 2-qubit gates applied to the same two qubits can be
combined into a single 2-qubit gate. For example, Fig. 1.2b is a depth 6
rearrangement of Fig. 1.2a. These circuits are discussed in more detail in
Chapter 5.

The basis {|0⟩, |1⟩} is referred to as the computational basis. The
simplest representation of an n-qubit state is

\[ |\Psi\rangle = \frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n-1} |k\rangle, \tag{1.8} \]

where k is an n-bit number. The information theoretic properties of qubits
have undergone investigation by many authors and are well reviewed in [30].

Fig. 1.2: (a) Example of a quantum circuit, (b) depth 6 rearrangement of
(a). This circuit implements the encode stage of 5-qubit non-fault-tolerant
quantum error correction.

One natural extension of the qubit circuit model is to use d-level quantum
systems (qudits). The simplest representation of an n-qudit state is

\[ |\Psi\rangle = \frac{1}{\sqrt{d^n}} \sum_{k=0}^{d^n-1} |k\rangle, \tag{1.9} \]

where k is expressed in base d. The properties of qudit entanglement [31],
qudit teleportation [32], qudit error correction [33, 34], qudit
cryptography [35, 36], and qudit algorithms [37, 38] are broadly similar to
the corresponding properties of qubits, and will not be discussed further
here.

A third variant of the circuit model exists based on continuous quantum
variables such as the position and momentum eigenstates of photons.
Considerable experimental work on the entanglement of continuous quantum
variables has been performed and is reviewed in Ref. [39]. In particular,
the problem of continuous quantum variable teleportation [40] has received a
great deal of attention. Furthermore, methods have been devised to perform
continuous variable error correction [41], cryptography [42], and continuous
variable versions of a number of algorithms [43, 44, 45, 46].
1.2.2 Adiabatic quantum computation

In the adiabatic model of quantum computation [47], a Hamiltonian Hf
is found such that its ground state is the solution of the problem under
consideration. This final Hamiltonian Hf must also be continuously
deformable (for example, by varying the strength of a magnetic field) to
some initial Hamiltonian Hi with a ground state that is easy to prepare.
After initializing the computer in the ground state of Hi, Hi is
adiabatically (i.e. sufficiently slowly to leave the system in the ground
state) deformed into Hf, yielding the ground state of Hf. The major
difficulty is determining how slowly the deformation must occur to be
adiabatic. The adiabatic model can be applied to systems of qubits, qudits,
and continuous quantum variables, and is equivalent to the circuit model in
the sense that any adiabatic algorithm can be converted into a quantum
circuit with at most polynomial overhead [48, 49]. The principal advantage
of the adiabatic model is an alternative way of thinking about quantum
computation that has led to numerous new algorithms tackling problems in
graph theory [50], combinatorics [51], condensed matter and nuclear physics
[52], and set theory [53].

1.2.3 Cluster states 

Given a multi-dimensional lattice of qubits each initialized to
(|0⟩ + |1⟩)/√2 with identical tunable nearest neighbor interactions of form

\[ H_{ab}(t) = \hbar g(t) (1 + \sigma_z^{(a)})(1 + \sigma_z^{(b)})/4, \tag{1.10} \]

a cluster state [54] can be created by evolving the system for a time such
that ∫ g(t) dt = π. Cluster states are highly entangled states with the
remarkable
property that arbitrary quantum computations can be performed purely via
single-qubit measurements along arbitrary axes [55, 56]. A small amount of
classical computation is required between measurements. Cluster states can
also be defined over qudits [57], and are special cases of the more general
class of graph states [58]. Methods have been devised to perform quantum
communication [59], error correction [60] and fault-tolerant computation
[61] within the cluster state model. Entanglement purification can be used
to increase the reliability of cluster state generation [62]. Unlike the
adiabatic model, however, the cluster state model is yet to lead to any
genuinely new algorithms. Given the equivalence of the cluster state model
to the circuit model [63], the primary utility of the cluster state model
appears to be simpler physical implementation in certain systems such as
linear optics [64, 65], and possibly special cases of NV-centers in diamond,
quantum dots, and ion traps [66].



1.2.4 Topological quantum computation 

The primary difficulty in building a quantum computer is controlling data
degradation through interaction with the environment, generally called
decoherence. Interaction with the environment can in principle be eliminated
by using a topological model of computation. Topological quantum computation
was proposed by Kitaev in 1997 [67], and developed further in Ref. [68].
An alternative proposal was given by Freedman in Ref. [69]. Kitaev
considered an oriented 2-dimensional lattice of hypothetical particles with
many body interactions as shown in Fig. 1.3. The hypothetical particles on
each lattice link are related to the 60 even permutations of five
distinguishable objects P5. Certain types of excitations of the lattice are
also related to the group P5, and exist on lattice sites which correspond to
a vertex and face pair as shown in Fig. 1.4. These excitations are called
non-Abelian anyons. Of particular interest are pairs of excitations
|g, g⁻¹⟩. Non-Abelian anyon pairs have the remarkable property that simply
moving them through one another effects computation. Given two pairs
|g, g⁻¹⟩, |h, h⁻¹⟩, moving |g, g⁻¹⟩ through |h, h⁻¹⟩ as shown in Fig. 1.5
creates the state |hgh⁻¹, hg⁻¹h⁻¹⟩.

Fig. 1.3: Lattice of hypothetical particles used in the construction of the
anyonic model of quantum computation.

Let |0⟩ = |g, g⁻¹⟩ and |1⟩ = |h, h⁻¹⟩. Find x such that h = xgx⁻¹.
A quantum inverter can be constructed simply by moving states |0⟩, |1⟩
through the ancilla pair |x, x⁻¹⟩. More complicated operations can be
performed in a similar manner, using more ancilla pairs and pull-through
operations.

At sufficiently low temperature, the only way data can be corrupted in
a topological quantum computer is via the spontaneous interaction of two
anyons, an event that occurs with probability O(e^{-αl}), where l is the
minimum separation of any pair of anyons. This probability can be made
arbitrarily small simply by keeping the anyons well separated. While this is
a desirable property, clearly a computational model based on hypothetical
particles that do not exist in nature cannot be implemented.

Fig. 1.4: A conjugate pair of non-Abelian anyons.

Fig. 1.5: Performing operations in the anyonic model of quantum computation.

Recently, significant progress has been made on the topological quantum
computation model. General schemes using anyons based on arbitrary
non-solvable groups [70], and smaller, solvable, non-nilpotent groups [71]
have been devised. A less robust but simpler scheme based on Abelian anyons
has been proposed [72] and designs for possible experimental realizations
are emerging [73, 74]. Experimental proposals based on non-Abelian anyons
have also been constructed [75, 76].

1.2.5 Geometric quantum computation 

The basic idea of geometric quantum computation is illustrated by the
Aharonov-Bohm effect [77], in which a particle of charge q executing a loop
around a perfectly insulated solenoid containing flux Φ acquires a geometric
phase e^{iqΦ} [78]. As shown in Fig. 1.6, the phase acquired is insensitive
to the exact path taken. This phase shift can be used to build quantum
gates [79, 80]. At the moment, there are conflicting numerical and analytic
calculations both supporting and attacking the fundamental robustness of
geometric quantum computation [81].

1.3 Quantum computation is possible in principle 

For the remainder of the thesis we will focus on the qubit circuit model.
Physical realizations of this model of computation have been proposed in
the context of liquid NMR [82], ion traps [83] including optically [84, 85]
and physically [86, 87] coupled microtraps, linear optics [88], Josephson
junctions utilizing both charge [89, 90] and flux [91] degrees of freedom,
quantum dots [92], ³¹P in ²⁸Si architectures utilizing indirectly exchange
coupled donor nuclear spins [93], exchange coupled electron spins [94, 95],
both spins [96, 97], magnetic dipolar coupled electron spins [98] and qubits
encoded in the charge distribution of a single electron on two donors [99],
deep donors in silicon [100], acceptors in silicon [101], solid-state
ensemble NMR utilizing lines of ²⁹Si in ²⁸Si [102], electrons floating on
liquid helium [103], cavity QED [104, 105], optical lattices [106, 107] and
the quantum Hall effect [108]. At the present time, the ion trap approach is
the closest to realizing the five basic requirements of scalable quantum
computation [109].

Fig. 1.6: Example of the path independence of the phase shift induced by the
Aharonov-Bohm effect. A particle of charge q executing a loop around a
perfectly insulated solenoid containing flux Φ acquires a geometric phase
e^{iqΦ}.

The mere existence of quantum computer proposals and quantum algorithms
is not sufficient to say that quantum computation is possible in principle.
Firstly, almost all quantum algorithms that provide an exponential speed up
over the best-known equivalent classical algorithms make use of the quantum
Fourier transform, which in turn uses exponentially small rotations. For
example, in the case of Shor's algorithm, to factor an L-bit number, in
principle single-qubit rotations of magnitude 2π/2^{2L} are required.
Clearly this is impossible for large L.

Coppersmith resolved this issue by showing that most of the small
rotations in the quantum Fourier transform could simply be ignored without
significantly affecting the output of the circuit. For the specific case of
Shor's algorithm, as described in Chapter 9, we took this work further to
show that rotations of magnitude π/128 implemented with accuracy ±π/512
were sufficient to factor integers thousands of bits long.
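
To give a feel for how much of the transform is discarded, the sketch
below counts controlled rotations in the textbook quantum Fourier transform
circuit, in which two qubits separated by d positions are coupled by a
rotation of angle π/2^d, and compares this with the number kept when all
angles smaller than π/2^m are dropped. This is only a counting exercise
under that assumed circuit layout, not the analysis of Chapter 9; the
function name and the example register size are hypothetical.

```python
def qft_rotation_counts(n_qubits: int, max_power: int):
    """Return (exact, kept) numbers of controlled rotations.

    Assumed layout: the textbook n-qubit QFT, where each pair of qubits
    separated by d positions shares one controlled rotation of angle
    pi / 2**d. The approximate transform keeps only d <= max_power.
    """
    if n_qubits < 1 or max_power < 0:
        raise ValueError("need n_qubits >= 1 and max_power >= 0")
    exact = n_qubits * (n_qubits - 1) // 2
    largest_kept = min(max_power, n_qubits - 1)
    # There are (n_qubits - d) pairs of qubits at separation d.
    kept = sum(n_qubits - d for d in range(1, largest_kept + 1))
    return exact, kept


if __name__ == "__main__":
    # Hypothetical example: a 4000-qubit transform, smallest angle pi/128.
    exact, kept = qft_rotation_counts(4000, 7)
    print(f"exact: {exact} rotations, approximate: {kept} rotations")
```

With the cutoff at π/128 = π/2^7 the number of controlled rotations grows
linearly rather than quadratically with the register size.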

More seriously, quantum systems are inherently fragile. The gamut of
relaxation processes, environmental couplings, and even systematic errors
induced by architectural imperfections are typically grouped under the
heading of decoherence. Ignoring leakage errors, in which a qubit is
destroyed or placed in a state other than those selected for computation
[110], Shor made the surprising discovery that all types of decoherence
could be corrected simply by correcting unwanted bit-flips (X), phase-flips
(Z) and both simultaneously (XZ) [111]. Shor's scheme required each qubit of
data to be encoded across nine physical qubits. This was the first quantum
error correction code (QECC).

Later work by Laflamme gave rise to a 5-qubit QECC [112], the smallest
code that can correct an arbitrary error to one of its qubits [30]. Steane's
7-qubit code [113], which also only guarantees to correct an arbitrary error
to one of its qubits, is, however, more convenient for the purposes of
quantum computation. Steane's code is part of the large class of CSS codes
(Calderbank, Shor, Steane) [114], which is in turn part of the very large
class of stabilizer codes [115]. Only a few examples such as permutationally
invariant codes [116] exist outside the class of stabilizer codes.

To illustrate the utility of the stabilizer formalism, consider a 5-qubit
QECC [117] in which we create logical |0⟩ and |1⟩ states corresponding to
the superpositions

\[ |0_L\rangle = |00000\rangle + |10010\rangle + |01001\rangle + |10100\rangle
 + |01010\rangle - |11011\rangle - |00110\rangle - |11000\rangle
 - |11101\rangle - |00011\rangle - |11110\rangle - |01111\rangle
 - |10001\rangle - |01100\rangle - |10111\rangle + |00101\rangle, \tag{1.11} \]

\[ |1_L\rangle = |11111\rangle + |01101\rangle + |10110\rangle + |01011\rangle
 + |10101\rangle - |00100\rangle - |11001\rangle - |00111\rangle
 - |00010\rangle - |11100\rangle - |00001\rangle - |10000\rangle
 - |01110\rangle - |10011\rangle - |01000\rangle + |11010\rangle. \tag{1.12} \]

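These superpositions redundantly encode data in such a way that an arbitrary
error to any one qubit can be corrected. They are, however, difficult to work
with directly. In the stabilizer formalism, |0_L⟩ and |1_L⟩ are instead
described as simultaneous +1 eigenstates of

\[ M_1 = X \otimes Z \otimes Z \otimes X \otimes I \tag{1.13} \]

\[ M_2 = I \otimes X \otimes Z \otimes Z \otimes X \tag{1.14} \]

\[ M_3 = X \otimes I \otimes X \otimes Z \otimes Z \tag{1.15} \]

\[ M_4 = Z \otimes X \otimes I \otimes X \otimes Z \tag{1.16} \]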
These operators are called stabilizers. Let S denote the set of stabilizers
(which is the group generated by M1-M4). Any valid logical state must
satisfy M|Ψ_L⟩ = |Ψ_L⟩ for all M ∈ S. This observation allows us to
determine which quantum gates can be applied directly to the logical states.
Specifically, if we wish to apply U to our logical state to obtain U|Ψ_L⟩,
then

\[ U|\Psi_L\rangle = UM|\Psi_L\rangle = UMU^\dagger U|\Psi_L\rangle. \]

Since UMU†U|Ψ_L⟩ must be a logical state, UMU† must be a stabilizer. If
we restrict our attention to gates U that are products of single-qubit gates,
only two single logical qubit gates

\[ X_L = X \otimes X \otimes X \otimes X \otimes X \tag{1.17} \]

\[ Z_L = Z \otimes Z \otimes Z \otimes Z \otimes Z \tag{1.18} \]

can be applied directly to data encoded using this version of 5-qubit QEC.
Note that by restricting our attention to products of single-qubit gates, any
error present in one of the qubits of the code before a logical gate operation
cannot be copied to other qubits. Any circuit with the property that a single
error can cause at most one error in the output is called fault-tolerant.

So far, we have not explained how errors are corrected using a QECC.
Given a potentially erroneous state |Ψ'⟩, one way of locating any errors is
to check whether |Ψ'⟩ is still a +1 eigenstate of each of the stabilizers.
Any errors so located can then be manually corrected. This method is
described in some detail in Chapter 10.

We now have basic single logical qubit gates and error correction. To
achieve universal quantum computation we need to be able to couple logical
qubits and perform arbitrary single logical qubit gates [28, 29].
Unfortunately, the 5-qubit QECC does not readily permit multiple logical
qubits to be coupled, though a complicated three logical qubit gate does
exist [118]. For universal quantum computation, the 7-qubit Steane code, or
indeed any of the CSS codes, is more appropriate as they permit a simple
transversal implementation of logical CNOT as shown in Fig. 1.7. The 7-qubit
Steane code also permits similar transversal single-qubit gates H, X, Z, S
and S† (see Fig. 10.2). These are, however, insufficient to construct
arbitrary single-qubit rotations of the form

\[ \begin{pmatrix} \cos(\theta/2)e^{i(\alpha+\beta)/2} & \sin(\theta/2)e^{i(\alpha-\beta)/2} \\ -\sin(\theta/2)e^{i(-\alpha+\beta)/2} & \cos(\theta/2)e^{i(-\alpha-\beta)/2} \end{pmatrix}. \tag{1.19} \]

Fig. 1.7: 7-qubit transversal logical CNOT gate.



An additional gate such as the T-gate is required to construct arbitrary
single-qubit gates. The simplest fault-tolerant implementation of the T-gate
we have been able to devise still requires an additional 12 ancilla qubits,
and at least 93 gates, 45 resets and 17 measurements arranged on a circuit of
depth at least 92, and is described in detail in Chapter 10. By virtue of the
fact that the operation HT corresponds to the rotation of a qubit by an
angle that is an irrational multiple of π, and the fact that repeated
rotation by such an angle enables arbitrarily close approximation of a
rotation of any angle, the pair of single-qubit gates H, T is sufficient to
approximate an arbitrary single-qubit rotation arbitrarily accurately. In
practice, it is better to use the 23 unique combinations of H, X, Z, S and
S† in addition to the T-gate since, for example, the logical X-gate is
vastly simpler to implement than HTTTTH. The existence of efficient
sequences of gates to approximate arbitrary unitary rotations in general,
and single-qubit gates in particular, is guaranteed by the Solovay-Kitaev
theorem [119, 120, 30, 121]. The exact length of such sequences required to
achieve a given accuracy is discussed in Chapter 10.

We now have enough machinery to consider arbitrarily large, arbitrarily
reliable quantum computation. Suppose every qubit in our computer has
probability p of suffering an error per unit time, where unit time refers to
the amount of time required to implement the slowest fundamental gate
(not logical gate). Consider a circuit consisting of the most complicated
fault-tolerant logic gate, the T-gate, followed by quantum error correction.
By virtue of being fault-tolerant, this circuit can only fail to produce
useful output if at least two errors occur. For some constant c, and
sufficiently low error rate p, the probability of failure of the structure is
thus cp². Since we have chosen the most complicated gate, every other
fault-tolerant gate followed by error correction, including the identity I
(do nothing) gate, has probability of failure at most cp².

Consider an arbitrary quantum circuit expressed in terms of the
fundamental gates CNOT, I, H, X, Z, S, S†, their products, and T (no QEC at
this stage). In the worst case, if an error anywhere causes the circuit to
fail, the probability of success of such a circuit is (1 − p)^{qt}, where q
is the number of qubits and t the number of time steps. By replacing each
gate with an error corrected fault-tolerant structure, this can be improved
to (1 − cp²)^{qt}. Note that the new circuit is still expressed entirely in
terms of the allowed gates and measurement. We can therefore repeat the
process and replace these gates in turn with error corrected fault-tolerant
structures, giving an overall reliability of (1 − c³p⁴)^{qt}. If we repeat
this k times we find that the overall circuit reliability is

\[ \left(1 - \frac{(cp)^{2^k}}{c}\right)^{qt}. \tag{1.20} \]

Clearly, provided the error rate per time step is less than p_th = 1/c and
we have sufficient resources, an arbitrarily large quantum computation can
be performed arbitrarily reliably. This is the threshold theorem of quantum
computation [122]. Note that the greater the amount p < p_th, the fewer
levels of error correction that are required. Despite the lack of a definitive
reference, pth is frequently assumed to be 10""^. 
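
The concatenation argument above can be turned into a small calculation.
The sketch below evaluates the failure probability per encoded gate after k
levels, (cp)^{2^k}/c, the resulting circuit reliability, and the number of
levels needed to reach a target failure probability. The values of c, p, q
and t used in the example are purely hypothetical and are not estimates
from this thesis.

```python
def failure_after_levels(p: float, c: float, levels: int) -> float:
    """Failure probability per gate after `levels` rounds of concatenation.

    Level 0 is the bare rate p, level 1 is c*p**2, level 2 is c**3*p**4,
    and in general (c*p)**(2**levels) / c.
    """
    return (c * p) ** (2 ** levels) / c


def circuit_reliability(p: float, c: float, levels: int,
                        qubits: int, steps: int) -> float:
    """Worst-case success probability (1 - p_k)**(q*t)."""
    return (1.0 - failure_after_levels(p, c, levels)) ** (qubits * steps)


def levels_needed(p: float, c: float, target: float) -> int:
    """Smallest number of levels with per-gate failure <= target."""
    if c * p >= 1.0:
        raise ValueError("error rate is not below the threshold 1/c")
    levels = 0
    while failure_after_levels(p, c, levels) > target:
        levels += 1
    return levels


if __name__ == "__main__":
    c, p = 1e4, 1e-5          # hypothetical values, so p_th = 1/c = 1e-4
    for k in range(4):
        print(k, failure_after_levels(p, c, k),
              circuit_reliability(p, c, k, qubits=100, steps=10**6))
    print("levels for 1e-10:", levels_needed(p, c, 1e-10))
```

Because the exponent doubles with every level, an error rate only a factor
of ten below threshold already gains several orders of magnitude per level
in this toy example.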

Substantial customization and optimization of the threshold theorem has
occurred over the years. As described, the threshold theorem requires the
ability to interact arbitrarily distant pairs of qubits in parallel, fast
measurement gates, and fast and reliable classical computation. A detailed
study of the impact of noisy long-range communication has been performed
yielding threshold error rates ~10^'* for a variety of assumptions [123].
Modifications removing the need for measurement and classical processing but
still requiring error-free long-range interactions have been devised at the
cost of greatly increased resources [124], though subsequent work has
substantially reduced the complexity of the required quantum circuitry, with
a threshold p_th = 2.4 x 10^^ obtained under the additional assumption that
the quantum computer consists of a single line of qubits with nearest
neighbor interactions only [125].

By using a much simplified error correction scheme devised by Steane
[126] that works on any CSS code, and using larger QECCs and less
concatenation, encouragingly high thresholds ~10~^ [127] and even 9 x 10^^
[128] have been calculated, although in the latter case only under the
assumption that errors occur after gates, not to idle qubits, and in both
cases long-range interactions must be available.

An alternative approach is to perform computation by interacting data
with specially prepared ancilla states [129, 130], a method called
postselected quantum computing. While the resources required to prepare
sufficiently reliable ancilla states are prohibitive, in principle this
approach permits arbitrarily large computations to be performed provided
p < p_th = 0.03 [131].

1.4 Overview 

The primary goal of this thesis is to relax the theoretical resource
requirements of large-scale quantum computation. Much of our work is
motivated by the Kane ³¹P in ²⁸Si architecture, in which fast and reliable
measurement and classical processing is difficult to achieve, and qubits may
be limited to a single line with nearest neighbor interactions only.

In Chapters 2-3, the viability of the adiabatic Kane CNOT gate and readout
operation are assessed. Chapter 4 reviews a technique of constructing
efficient 2-qubit gates. A greatly simplified non-fault-tolerant linear
nearest neighbor implementation of 5-qubit quantum error correction is
presented in Chapter 5, and further modified to remove the need for
measurement and classical processing in Chapter 6. Chapter 7 provides a
detailed review of Shor's algorithm, followed by a linear nearest neighbor
circuit implementation in Chapter 8. Chapter 9 focuses on removing the need
for exponentially small rotations in circuit implementations of Shor's
algorithm. Finally, Chapter 10 presents a method of constructing arbitrary
single-qubit fault-tolerant gates, and applies this to the specific case of
the remaining rotation gates required by Shor's algorithm. Chapter 11
contains concluding remarks and summarizes the results of the thesis.
Chapters 121 Ol El El and El have been published in
[132, 133, 134, 135, 136] respectively.
2. The adiabatic Kane CNOT gate



The spins of the ³¹P nucleus and donor electron in bulk Si have extremely
long coherence times [137, 138]. Consequently, a number of quantum computer
proposals have been based on this system [321 EZl HH HH]. In this
chapter and Chapter 3 we focus on Kane's 1998 proposal [93], which calls
for a line of single ³¹P atoms spaced approximately 20 nm apart [139, 140].
The spin of each phosphorus nucleus is used as a qubit and the donor elec- 
tron used to couple to neighboring qubits. Neglecting readout mechanisms 
for the moment, each qubit requires at least two electrodes to achieve single- 
and 2-qubit gates. The extent to which the presence and operation of these 
electrodes will reduce the system's coherence times is unknown. We there- 
fore study the error rate of the Kane cnot gate as a function of the coherence 
times to determine their approximate minimum acceptable values. We find 
that the coherence times required to achieve a cnot error rate of lO"'' are 
a factor of 6 less than those already observed experimentally. 

The chapter is organized as follows. In Section 2.1 the Kane architecture 
is briefly described, followed by the method of performing a CNOT in 
Section 2.2. In Section 2.3 the technique we used to model finite coherence 
times is presented along with contour plots of the CNOT error rate as a 
function of the coherence times. In Section 2.4 we discuss the implications of 
our results. 
2.1 The Kane architecture 

Ignoring readout mechanisms, which we discuss in Chapter 3, the basic 
layout of the adiabatic Kane phosphorus in silicon solid-state quantum 
computer is shown in Fig. 2.1. The phosphorus donor electrons are used 
primarily to mediate interactions between neighboring nuclear spin qubits. 
As such, the donor electrons are polarized to remove their spin degree of 
freedom. This can be achieved by maintaining a steady $B_z = 2$ T at around 
$T = 100$ mK [141]. Techniques for relaxing the high field and low temperature 
requirements such as spin refrigeration are under investigation [96]. 

In addition to the potential to build on the vast expertise acquired during 
the last 50 years of silicon integrated circuit development, the primary 
attraction of the Kane architecture, and $^{31}$P in $^{28}$Si architectures in 
general, is their extraordinarily long spin coherence times [137]. Four 
quantities are of interest: the relaxation ($T_1$) and dephasing ($T_2$) times of 
both the donor electron and nucleus. Both times only have meaning when the 
system is in a steady magnetic field. Assuming the field is parallel with the 
z-axis, the relaxation time refers to the time taken for 1/e of the spins in the 
sample to spontaneously flip, whereas the dephasing time refers to the time 
taken for the x and y components of a single spin to decay by a factor of 1/e. 
Existing experiments cannot measure $T_2$ directly, but instead measure a third 
quantity, the time taken for the x and y components of an ensemble of spins 
to decay by a factor of 1/e. Since this ensemble time is shorter than $T_2$ [142], 
experimental values of it provide a lower bound for $T_2$. 

In natural silicon containing 4.7% $^{29}$Si, relaxation times $T_1$ in excess of 1 
hour have been observed for the donor electron at $T = 1.25$ K and $B \sim 0.3$ T 
[137]. The nuclear relaxation time has been estimated at over 80 hours in 
similar conditions [143]. These times are so long that we will ignore relaxation 
in our simulations of gate reliability. The donor electron dephasing 
time T| in enriched ^^Si containing less than 50ppm ^^Si, at 7K and donor 
concentration 0.8 x 10^^ has been measured to be 14ms |138j . Extrapo- 
lation to a single donor suggests T2 = 60ms at 7K. An even longer T2 is 
expected at lower temperatures. At the time of writing, to the authors' 
knowledge, no experimental data relating to the nuclear dephasing time has 
been obtained. However, considering the much greater isolation from the 
environment of the nuclear spin, this time is expected to be much larger 
than the electron dephasing time. 



Fig. 2.1: Schematic of the Kane architecture (figure labels: electrodes, SiO2 
barrier, T ~ 100 mK). The rightmost two qubits show the notation to be 
used when discussing the CNOT gate. 

Control of the nuclear spin qubits is achieved via electrodes above and 
between each phosphorus atom and global transverse oscillating fields of 
magnitude $\sim 10^{-3}$ T. To selectively manipulate a single qubit, the A-electrode 
above it is biased. A positive/negative bias draws/drives the donor electron 
away from the nucleus, in both cases reducing the magnitude of the hyperfine 
interaction. This in turn reduces the energy difference between nuclear spin 
up ($|0\rangle$) and down ($|1\rangle$), allowing this transition to be brought into resonance 
with a globally applied oscillating magnetic field. Depending on the timing 
of the A-electrode bias, an arbitrary rotation about an axis in the x-y plane 
can be implemented [144]. By utilizing up to three such rotations, a qubit 
can be rotated into an arbitrary superposition $\alpha|0\rangle + \beta|1\rangle$. 

Biasing an A-electrode above a particular donor will also affect neighboring 
and more distant donors. It is likely that compensatory biasing of 
nearby electrodes will be required to ensure non-targeted qubits remain off 
resonant. Subject to this restriction, there is no limit to the number of 
simultaneous single-qubit rotations that can be implemented throughout a 
Kane quantum computer. 

Interactions between neighboring qubits are governed by the J-electrodes. 
A positive bias encourages greater overlap of the donor electron wave func- 
tions leading to indirect coupling of their associated nuclei. In analogy to 
the single-qubit case, this allows multiple 2-qubit gates to be performed se- 
lectively between arbitrary neighbors. A discussion of the electrode pulses 
required to implement a cnot is given in the next section. 

2.2 Adiabatic cnot pulse profiles 

Performing a CNOT gate on an adiabatic Kane QC is an involved process 
described in detail in [141]. Given the high field (2 T) and low temperature 
(100 mK) operating conditions, we can model the behavior of the system 
with a spin Hamiltonian. Only two qubits are required to perform a CNOT, 
so for the remainder of the chapter we will restrict our attention to a 
computer with just two qubits. The basic notation is shown in the right half 
of Fig. 2.1. Furthermore, let $\sigma^z_{n1} = \sigma^z \otimes I \otimes I \otimes I$, 
$\sigma^z_{e1} = I \otimes \sigma^z \otimes I \otimes I$, $\sigma^z_{n2} = I \otimes I \otimes \sigma^z \otimes I$ and 
$\sigma^z_{e2} = I \otimes I \otimes I \otimes \sigma^z$, where $I$ is the 2x2 identity 
matrix, $\sigma^z$ is the usual Pauli matrix and $\otimes$ denotes the matrix outer 
product. With these definitions the meaning of terms such as $\sigma^x_{n2}\sigma^y_{e1}$ should 
be self evident. 

Let $g_n$ be the g-factor for the phosphorus nucleus, $\mu_n$ the nuclear 
magneton and $\mu_B$ the Bohr magneton. The Hamiltonian can be broken into 
three parts 

    $H = H_Z + H_{int}(t) + H_{ac}(t). \qquad (2.1)$ 

The Zeeman energy terms are contained in $H_Z$ 

    $H_Z = -g_n\mu_n B_z(\sigma^z_{n1} + \sigma^z_{n2}) + \mu_B B_z(\sigma^z_{e1} + \sigma^z_{e2}). \qquad (2.2)$ 

The contact hyperfine and exchange interaction terms, both of which can 
be modified via the electrode potentials, are 

    $H_{int}(t) = A_1(t)\,\sigma_{n1}\cdot\sigma_{e1} + A_2(t)\,\sigma_{n2}\cdot\sigma_{e2} + J(t)\,\sigma_{e1}\cdot\sigma_{e2}, \qquad (2.3)$ 

where $A_i(t) = 8\pi\mu_B g_n\mu_n|\Phi_i(0,t)|^2/3$, $|\Phi_i(0,t)|$ is the magnitude of the 
wavefunction of donor electron $i$ at phosphorus nucleus $i$ at time $t$, and $J(t)$ 
depends on the overlap of the two donor electron wave functions. The 
dependence of these quantities on their associated electrode voltages is a subject 
of ongoing research [145, 146, 147, 148, 149], though it appears atomic 
precision placement of the phosphorus donors is required due to strong 
oscillatory dependence of the exchange interaction strength on the distance 
and direction of separation of donors. For our purposes, it is sufficient to 
ignore the exact voltage required and assume that the hyperfine and exchange 
interaction energies $A_i$ and $J$ are directly manipulable. 
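The operator definitions and Eqs (2.1)-(2.3) can be sketched numerically as follows. This is a minimal illustration, not the simulation code used for the results in this chapter: the function names are hypothetical, energies are taken in units of $g_n\mu_n B_z$, the oscillating-field term is omitted, and the electron Zeeman energy `mu_b_bz` is an assumed input that the reader must supply.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI = {
    "x": np.array([[0, 1], [1, 0]], dtype=complex),
    "y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "z": np.array([[1, 0], [0, -1]], dtype=complex),
}
# Tensor-factor ordering |n1 e1 n2 e2>, matching the definitions in the text.
SLOT = {"n1": 0, "e1": 1, "n2": 2, "e2": 3}


def op(axis, spin):
    """Pauli matrix `axis` acting on `spin`, identity on the other three spins."""
    factors = [I2] * 4
    factors[SLOT[spin]] = PAULI[axis]
    out = factors[0]
    for f in factors[1:]:
        out = np.kron(out, f)
    return out


def dot(s1, s2):
    """sigma_s1 . sigma_s2 = sum over x, y, z of the operator products."""
    return sum(op(a, s1) @ op(a, s2) for a in "xyz")


def hamiltonian(a1, a2, j, mu_b_bz, gn_mun_bz=1.0):
    """H_Z + H_int of Eqs (2.2)-(2.3); mu_b_bz is an assumed parameter."""
    h_z = (-gn_mun_bz * (op("z", "n1") + op("z", "n2"))
           + mu_b_bz * (op("z", "e1") + op("z", "e2")))
    h_int = a1 * dot("n1", "e1") + a2 * dot("n2", "e2") + j * dot("e1", "e2")
    return h_z + h_int
```

Because every term conserves the total z-component of spin, the resulting 16x16 matrix commutes with the sum of the four $\sigma^z$ operators, which gives a quick sanity check on any implementation.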

The last part of the Hamiltonian contains the coupling to a globally 
applied oscillating field of magnitude $B_{ac}(t)$. 

(2.4) 

Using the above definitions, only the quantities $A_1$, $J$ and $B_{ac}$ need to be 
manipulated to perform a CNOT gate. 

For clarity, assume the computer is initially in one of the states $|00\rangle$, 
$|01\rangle$, $|10\rangle$ or $|11\rangle$ and that we wish to perform a CNOT gate with qubit 1 (the 
left qubit) as the control. The necessary profiles are shown in Fig. 2.2. Step 
one is to break the degeneracy of the two qubits' energy levels to allow the 
control and target qubits to be distinguished. To make qubit 1 the control, 
the value of $A_1$ is increased (qubit 1 will be assumed to be the control qubit 
for the remainder of the chapter). 

Step two is to gradually apply a positive potential to the J-electrode 
in order to force greater overlap of the donor electron wave functions and 
hence greater (indirect) coupling of the underlying nuclear qubits. The rate 
of this change is limited so as to be adiabatic — qubits initially in energy 
eigenstates remain in energy eigenstates throughout this step. This point 
shall be discussed in more detail shortly. 

Let $|\mathrm{symm}\rangle$ and $|\mathrm{anti}\rangle$ denote the standard symmetric and antisymmetric 
superpositions of $|10\rangle$ and $|01\rangle$. Step three is to adiabatically reduce the $A_1$ 
coupling back to its initial value once more. During this step, 
anti-level-crossing behavior changes the input states $|10\rangle \rightarrow |\mathrm{symm}\rangle$ and 
$|01\rangle \rightarrow |\mathrm{anti}\rangle$. 

Step four is the application of an oscillating field $B_{ac}$ resonant with the 
$|\mathrm{symm}\rangle \leftrightarrow |11\rangle$ transition. This oscillating field is maintained until these 
two states have been interchanged. Steps five to seven are the time reverse 
of steps one to three. Note that steps one and seven (the increasing and 
decreasing of $A_1$) appear instantaneous in Fig. 2.2 as the only limit to their 
speed is that they be done in a time much greater than $\hbar/0.01\,\mathrm{eV} \sim 0.1$ ps, 
where 0.01 eV is the orbital excitation energy of the donor electron. 

In principle, the adiabatic steps should be performed slowly to achieve 
maximum fidelity. In practice, slow gates are more vulnerable to decoherence. 
To resolve this conflict, consider the degree to which the evolution of 
a given $H(t)$ deviates from perfect adiabaticity 

    $\Theta(t) = \max_{a \neq b} \left| \frac{\hbar\,\langle\psi_a(t)|\frac{\partial H}{\partial t}(t)|\psi_b(t)\rangle}{\left(\langle\psi_a(t)|H(t)|\psi_a(t)\rangle - \langle\psi_b(t)|H(t)|\psi_b(t)\rangle\right)^2} \right|. \qquad (2.5)$ 

Ignoring decoherence for the moment, for high fidelity it is necessary that 
$\Theta(t) \ll 1$. The states $|\psi_a(t)\rangle$ are the eigenstates of $H(t)$. To a certain 
extent, it is possible to reduce $\Theta(t)$ without increasing the duration of a 
step by optimizing the profiles of the adiabatically varying parameters in 
$H(t)$. In the case of the adiabatic Kane CNOT, this means optimizing the 
profiles of $A_1(t)$ and $J(t)$. 
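As a rough illustration of how an adiabaticity measure of this kind might be evaluated numerically, the sketch below diagonalizes $H(t)$, estimates $\partial H/\partial t$ by a central finite difference, and takes the largest ratio of coupling to squared energy gap. It assumes units with $\hbar = 1$, skips exactly degenerate pairs of levels, and uses a hypothetical function name; it is not the routine used to produce the figures in this chapter.

```python
import numpy as np


def adiabatic_measure(h_of_t, t, dt=1e-6, hbar=1.0, tol=1e-12):
    """Max over a != b of hbar |<a|dH/dt|b>| / (E_a - E_b)^2 at time t.

    h_of_t: callable returning a Hermitian matrix H(t).
    Pairs of levels closer than `tol` are skipped (an assumption of this sketch).
    """
    h = h_of_t(t)
    # Central finite difference for dH/dt.
    dh = (h_of_t(t + dt) - h_of_t(t - dt)) / (2 * dt)
    energies, states = np.linalg.eigh(h)
    # Matrix elements of dH/dt in the instantaneous eigenbasis.
    coupling = states.conj().T @ dh @ states
    worst = 0.0
    n = len(energies)
    for a in range(n):
        for b in range(n):
            gap = energies[a] - energies[b]
            if a != b and abs(gap) > tol:
                worst = max(worst, hbar * abs(coupling[a, b]) / gap**2)
    return worst


def two_level(t, g=1.0):
    """Toy avoided crossing H(t) = t sigma_z + g sigma_x, for illustration only."""
    return np.array([[t, g], [g, -t]], dtype=complex)
```

For the toy avoided crossing the measure peaks at the point of closest approach of the two levels, which is why pulse profiles that slow down near a crossing reduce it.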

Fig. 2.2: Gate profiles and state energies during a CNOT gate in units of 
$g_n\mu_n B_z = 7.1 \times 10^{-5}$ meV. (Figure labels: steps 1 to 7; $A_1(t)$ between 1.683 
and 1.703; $J(t)$ up to 810; $B_{ac}(t)$.) 

Fig. 2.3: Possible forms of the $J(t)$ profile for step 2 of the adiabatic CNOT gate. 
$J(t)$ is in units of $g_n\mu_n B_z = 7.1 \times 10^{-5}$ meV. 

Various profiles for the adiabatic steps in the CNOT procedure have been 
investigated in [150]. In Fig. 2.3 we have plotted three possible $J(t)$ profiles 
for step two of the CNOT gate. The function $\Theta(t)$ for each profile is shown 
in Fig. 2.4. Profile 1 is a simple linear pulse. Profile 2 can be seen to be 
the best of the three and is described by $J(t) = 810\,a\,(1 - \mathrm{sech}(5t/\tau))$, where 
$\tau = 9\,\mu$s is the duration of the pulse and $a = 1.01366$ is a factor introduced 
to ensure that $J(\tau) = 810$. The third profile 

    $J(t) = \begin{cases} \dfrac{810\,(1+\pi/2)\,t}{2\tau}, & 0 < t < \tau/(1+\pi/2) \\[2mm] \dfrac{810}{2}\left[1 + \sin\!\left(\dfrac{(1+\pi/2)\,t - \tau}{\tau}\right)\right], & \tau/(1+\pi/2) < t < \tau \end{cases} \qquad (2.6)$ 

is a composite linear-sinusoidal profile that was used in the calculations 
presented in this chapter due to numerical difficulties in solving the Schrodinger 
equation for profile 2. The advantage of the second two profiles over the 
linear one is that they flatten out as $J$ approaches 810. At $J = 816.65$, the 
system undergoes a level crossing. To maintain adiabatic evolution, $J(t)$ 
needs to change more slowly near this value. Note that the reason it is 
desirable to make $J(t)$ so large is to ensure that there is a large energy 
difference between $|\mathrm{symm}\rangle$ and $|\mathrm{anti}\rangle$ during step 4 (the application of $B_{ac}$). 
This difference is given by 



    $\delta E = 2A^2\left(\dfrac{1}{\mu_B B_z + g_n\mu_n B_z} - \dfrac{1}{\mu_B B_z + g_n\mu_n B_z - 2J}\right). \qquad (2.7)$ 

Without a large energy difference, the oscillating field $B_{ac}$, which is set to 
resonate with the transition $|\mathrm{symm}\rangle \leftrightarrow |11\rangle$, will also be very close to 
resonant with $|\mathrm{anti}\rangle \leftrightarrow |11\rangle$, causing a large error during the operation of the 
CNOT gate. This source of error can be further reduced by using a weaker 
$B_{ac}$ at the cost of slower gate operation. 

Fig. 2.4: The adiabatic measure $\Theta(t)$ for each $J(t)$ profile. 

Step 3 (the decreasing of $A_1$) could be performed without degrading the 
overall fidelity of the gate in a time of less than a microsecond with a linear 
pulse profile. 
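The three candidate $J(t)$ profiles for step two can be written down in a few lines. The sketch below is illustrative only: the function names are hypothetical, $J$ is in units of $g_n\mu_n B_z$, the pulse duration is taken to be the 9 microsecond stage-2 time, the sech normalisation is computed rather than hard-coded so that the pulse ends at 810, and the composite profile is assumed to join a linear ramp to a quarter sine wave with continuous value and slope at $t = \tau/(1+\pi/2)$.

```python
import math

J_MAX = 810.0  # final exchange coupling, in units of g_n mu_n B_z
TAU = 9.0      # pulse duration in microseconds (stage-2 duration)


def j_linear(t, tau=TAU):
    """Profile 1: simple linear ramp from 0 to J_MAX."""
    return J_MAX * t / tau


def j_sech(t, tau=TAU):
    """Profile 2: J_MAX * a * (1 - sech(5 t / tau)), with a fixed by J(tau) = J_MAX."""
    a = 1.0 / (1.0 - 1.0 / math.cosh(5.0))
    return J_MAX * a * (1.0 - 1.0 / math.cosh(5.0 * t / tau))


def j_composite(t, tau=TAU):
    """Profile 3 (assumed form): linear ramp joined smoothly to a quarter sine."""
    k = 1.0 + math.pi / 2.0
    if t < tau / k:
        return 0.5 * J_MAX * k * t / tau
    return 0.5 * J_MAX * (1.0 + math.sin(k * t / tau - 1.0))
```

All three start at zero and end at 810; the second and third flatten out towards the end of the pulse, which is where the level crossing makes slow variation most important.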

The above steps were simulated using an adaptive Runge-Kutta routine 
to solve the density matrix form of the Schrodinger equation 

    $\dot{\rho}(t) = \dfrac{1}{i\hbar}[H(t), \rho(t)] \qquad (2.8)$ 

in the computational basis $|n_1 e_1 n_2 e_2\rangle$. The times used for each stage were 
as follows 

    stage    duration (μs) 
    2        9.0000 
    3        0.1400 
    4        7.5989 
    5        9.0000 
    6        0.1400 

Note that the precision of the duration of stage 4 is required as the 
oscillating field $B_{ac}$ induces the states $|11\rangle$ and $|\mathrm{symm}\rangle$ to swap smoothly 
back and forth. The duration 7.5989 μs is the time required for one swap at 
$B_{ac} = 10^{-3}$ T. 

The other step times were obtained by first setting them to arbitrary 
values (~5 μs) and increasing them until the gate fidelity ceased to increase. 
The step times were then decreased one by one until the fidelity started 
to decrease. As such, the above times are the minimum time in which the 
maximum fidelity can be achieved. This maximum fidelity was found to be 
5 X 10^^ for all computation basis states. 



2.3 Intrinsic dephasing and fidelity 

In this chapter and Chapter 3, dephasing is modelled as exponential decay 
of the off diagonal components of the density matrix. While a large variety 
of more detailed dephasing models exist [151, 152, 153, 154, 155, 156], the 
chosen method is consistent with the observed experimental behavior of 
dephasing in solid-state systems [142]. The donor electrons and phosphorus 
nuclei are assumed to dephase at independent rates. With the inclusion of 
dephasing terms, Eq. (2.8) becomes 

    $\dot{\rho} = \dfrac{1}{i\hbar}[H, \rho] - \Gamma_n[\sigma^z_{n1}, [\sigma^z_{n1}, \rho]] - \Gamma_n[\sigma^z_{n2}, [\sigma^z_{n2}, \rho]]$ 
    $\qquad - \Gamma_e[\sigma^z_{e1}, [\sigma^z_{e1}, \rho]] - \Gamma_e[\sigma^z_{e2}, [\sigma^z_{e2}, \rho]]. \qquad (2.9)$ 

To understand the effect of each double commutator, it is instructive to 
consider the following simple mathematical example: 

    $\dot{M} = -\Gamma[\sigma^z, [\sigma^z, M]],$ 

    $\begin{pmatrix} m_{11}(t) & m_{12}(t) \\ m_{21}(t) & m_{22}(t) \end{pmatrix} = \begin{pmatrix} m_{11}(0) & m_{12}(0)e^{-4\Gamma t} \\ m_{21}(0)e^{-4\Gamma t} & m_{22}(0) \end{pmatrix}.$ 

Thus each double commutator in Eq. (2.9) exponentially decays its 
associated off diagonal elements with characteristic time $\tau_e = 1/4\Gamma_e$ or $\tau_n = 1/4\Gamma_n$. 
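The single-spin example above is easy to check numerically. The following sketch (hypothetical function names, a single 2x2 matrix rather than the full 16x16 density matrix) evaluates the double commutator directly and compares it with the closed-form solution in which only the off diagonal elements decay, at rate $4\Gamma$.

```python
import numpy as np

SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)


def double_commutator(a, m):
    """Return [a, [a, m]]."""
    inner = a @ m - m @ a
    return a @ inner - inner @ a


def dephase(m0, gamma, t):
    """Closed-form solution of dM/dt = -gamma [sigma_z, [sigma_z, M]]."""
    decay = np.exp(-4.0 * gamma * t)
    out = np.array(m0, dtype=complex)
    out[0, 1] *= decay  # off diagonal elements decay as exp(-4 gamma t)
    out[1, 0] *= decay
    return out           # diagonal elements are untouched
```

For any 2x2 matrix the double commutator with $\sigma^z$ equals four times its off diagonal part, which is the origin of the factor of 4 in the characteristic times.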



For each initial state $|00\rangle$, $|01\rangle$, $|10\rangle$ and $|11\rangle$, Eq. (2.9) was solved for a 
range of values of $\tau_e$ and $\tau_n$ using the pulse profiles described in Section 2.2, 
allowing a contour plot of the gate error versus $\tau_e$ and $\tau_n$ to be constructed 
(Figs 2.5-2.6). Note that each contour is a double line as each run of the 
simulation required considerable computational time and the data available 
does not allow finer delineation of exactly where each contour is. The worst 
case error of all input states as a function of $\tau_e$ and $\tau_n$ is shown in Fig. 2.7. 



Fig. 2.7 suggests that it would be acceptable for the dephasing times of the 
phosphorus donor electron and nuclei to be 10ms and 0.5s respectively if a 
CNOT reliability of 10^^ was desired. Given that the current best estimate 
of the donor electron dephasing time is 60 ms [138], and assuming that the 
nuclear dephasing time is at least a factor of 80 longer again, as for the case 
of the relaxation times, it would be acceptable for the presence of the silicon 
dioxide barrier, gate electrodes, and other control structures to reduce the 
dephasing times by a factor of 6 without impacting on the desired reliability 
of the gate. 

Fig. 2.5: Probability of error $\epsilon$ during a CNOT gate as a function of $\tau_e$ and $\tau_n$ for 
input state (a) $|00\rangle$ and (b) $|01\rangle$. The first qubit is the control. 

Fig. 2.6: Probability of error $\epsilon$ during a CNOT gate as a function of $\tau_e$ and $\tau_n$ for 
input state (a) $|10\rangle$ and (b) $|11\rangle$. The first qubit is the control. 

Fig. 2.7: The worst case probability of error $\epsilon$ during a CNOT gate as a function of 
$\tau_e$ and $\tau_n$ for all input states. 

Since the publication of this work [132], simpler and faster methods of 
implementing gates in the Kane architecture have been devised [144], though 
they are slightly more vulnerable to decoherence [157]. An interesting avenue 
of further work would be to similarly analyze 2-qubit gates in the context 
of three electron spin encoded qubits which enable arbitrary computations 
to be performed utilizing the exchange interaction only [158], thereby 
eliminating the need for oscillating magnetic fields and resulting in much faster 
gates. 
3. Adiabatic Kane single-spin readout 



Meaningful quantum computation cannot occur without qubit readout. In 
this chapter, we assess the viability of the adiabatic Kane single-spin 
readout proposal [93] when dephasing and other effects such as finite exchange 
coupling control are taken into account. We find that there are serious 
barriers to the implementation of the proposal, and briefly review alternatives 
under active investigation. 

The evolution of the hyperfine and exchange interaction strengths 
required to implement readout is reviewed in Section 3.1. The performance 
of the scheme is described in Section 3.2. Section 3.3 summarizes our results 
and points to alternative approaches to readout. 



3.1 Adiabatic readout pulse profiles 

The geometry of the adiabatic Kane readout proposal is shown in Fig. 3.1. 
The basic idea is to raise $A_1$ to distinguish the qubits, and apply appropriate 
voltages to induce the evolution of the hyperfine and exchange interaction 
strengths as shown in Fig. 3.2. This evolution passes through a level crossing 
resulting in the conversion of states [141] 

    $|\downarrow\downarrow\rangle|11\rangle \rightarrow |\downarrow\downarrow\rangle|11\rangle, \qquad (3.1)$ 
    $|\downarrow\downarrow\rangle|10\rangle \rightarrow |\downarrow\downarrow\rangle|s_n\rangle, \qquad (3.2)$ 
    $|\downarrow\downarrow\rangle|01\rangle \rightarrow |a_e\rangle|11\rangle, \qquad (3.3)$ 
    $|\downarrow\downarrow\rangle|00\rangle \rightarrow |a_e\rangle|a_n\rangle, \qquad (3.4)$ 

where $|\downarrow\rangle$ denotes a spin-down electron, $|a_e\rangle$ denotes the antisymmetric 
superposition of the two electrons, and similarly for $|s_n\rangle$ and $|a_n\rangle$. Note that 
if qubit 1 is in state $|1\rangle$ ($|0\rangle$) the final electron state will be $|\downarrow\downarrow\rangle$ ($|a_e\rangle$). 

Fig. 3.1: Geometry of adiabatic Kane readout. 

By converting the nuclear spin information into electron spin information 
in this manner, in principle we can apply a potential difference to the $A_1$- and 
$A_2$-electrodes and, by virtue of the Pauli exclusion principle, use the SET 
(single electron transistor) to observe tunnelling of electron 1 onto donor 2 
if and only if the nuclear spin was $|0\rangle$. 



3.2 Readout performance 

In exactly the same manner as Chapter 2, the performance of the adiabatic 
state conversion stage of readout was simulated with variable nuclear and 
electronic dephasing times $\tau_n$ and $\tau_e$. The results of these simulations are 
shown in Figs. 3.3-3.4 and show strong dependence on the initial 
computational basis state. Indeed, we found the basis state $|11\rangle$ to be immune to 
dephasing and, since it is far from any level crossings, perfectly preserved 
during the adiabatic evolution. Consequently, no figure has been included 
for $|11\rangle$. 

Fig. 3.2: Evolution of the hyperfine and exchange interaction strengths required to 
convert nuclear spin information into electron spin information, as shown 
in Eqs. (3.1)-(3.4). 

At the other extreme, even in the absence of dephasing, the nuclear 
state |00) is converted into electron state \ae) with high probability of error 
e ~ 10~^. With dephasing, state |00) also embodies the worst-case fidelity of 
the readout operation. For realistic dephasing times, this suggests a fidelity 
of state preparation (before tunnelling an actual readout) of 10"'^. 

While this low fidelity is not ideal, it could probably be tolerated. There 
are, however, more serious concerns. Firstly, even with donors spaced 20 nm 
apart and 1 V applied to the J-electrode, simulations suggest that the 
exchange coupling could be as weak as 0.03 meV, or 420 in units of $g_n\mu_n B_z$ 
[159]. This is insufficient to access the desired level crossing. 

Furthermore, the state of two electrons on one donor has a very weak 
binding energy of 1.7 meV. Calculations suggest that a DC field designed to 
encourage electron 1 to tunnel onto donor 2 would also be sufficient to ionize 
the $D^-$ state [133]. 

3.3 Conclusion 

Current simulations suggest that there are serious concerns with regard 
to the accessibility, fidelity, and stability of the states used in the adiabatic 
Kane readout proposal. Modifications to the scheme have been suggested as 
a way around these problems [133]: using an rf field to Rabi flip electron 1 
dependent on nuclear spin 1 instead of the adiabatic process, and using 
resonant fields to induce Rabi oscillations of electron 1 onto donor 2 
provided the transition is permitted by spin. A completely new scheme 
involving three donors, one ionized, and avoiding the need for double 
occupancy has also been proposed [160]. Further theoretical and experimental 
work is required to develop these ideas. 

Fig. 3.3: Probability of error $\epsilon$ during readout state preparation as a function of 
$\tau_e$ and $\tau_n$ for input state (a) $|00\rangle$ and (b) $|01\rangle$. 

Fig. 3.4: Probability of error $\epsilon$ during readout state preparation as a function of 
$\tau_e$ and $\tau_n$ for input state $|10\rangle$. 



4. Implementing arbitrary 2-qubit gates 



Efficiently implementing arbitrary 2-qubit gates using a given physical 
architecture is an important part of practical quantum computation. In this 
chapter, we combine techniques from Refs [161, 162, 163] to present an 
efficient, but not necessarily time-optimal, implementation of an arbitrary 
2-qubit gate using at most three periods of free evolution of a certain class 
of 2-qubit system interspersed with at most eight single-qubit gates. The 
construction requires that the architecture be able to isolate qubits from 
one another. This chapter should be considered clarification and review 
rather than original research. A number of examples of gates built using the 
method detailed in this chapter can be found in Ref. [144]. 

In Section 4.1 prior work on implementing quantum gates is reviewed. 
Section 4.2 contains essential terminology and notation. Given an arbitrary 
2-qubit gate $G$, Section 4.3 details how to construct a canonical 
decomposition comprised of four single-qubit gates and one purely non-local 2-qubit 
gate. Section 4.4 describes how to make the canonical decomposition unique. 
Section 4.5 uses the unique canonical decompositions of $G$ and a given 
2-qubit evolution operator $U(t)$ to obtain a physical implementation of $G$ using 
at most three periods of free evolution of $U(t)$ and eight single-qubit gates. 
Section 4.6 concludes with a discussion of the implications of canonically 
decomposed gates, and describes further work. 
4.1 Background 

Implementing arbitrary gates given an arbitrary architecture is a nontrivial 
task. General methods have been developed for two special classes of phys- 
ical architecture — those in which single-qubit gates can be approximated 
as instantaneous due to their speed relative to multiple qubit interactions, 
and those with the ability to isolate qubits from one another. When single- 
qubit gates are much faster than multiple-qubit interactions, time-optimal 
implementations can be found [165, 166, 164]. In the Kane architecture, 
single-qubit gates cannot be approximated as instantaneous, but qubits can 
be isolated from one another [140], so we focus on the efficient, but not 
necessarily time-optimal, method described in [163]. 

In both cases, the canonical decomposition |1611 1161^1 11631 ITHl] forms the 
starting point of the construction. Given an arbitrary 2-qubit gate G, the 
canonical decomposition is a circuit, equivalent up to global phase, involving 
up to four single-qubit gates $G_{1A}$, $G_{1B}$, $G_{2A}$, $G_{2B}$ and a purely non-local gate 
$G_\theta = e^{i(\theta_1 X\otimes X + \theta_2 Y\otimes Y + \theta_3 Z\otimes Z)}$ (Fig. 4.1a). Note that this decomposition is 
only unique if certain restrictions are placed on $\theta$. A complete discussion of 
the many symmetries of the canonical decomposition is given in [163]. 

Not all 2-qubit time evolution operators $U(t)$ admit such a simple 
decomposition. The most general time-dependent decomposition has a 2-qubit 
term 

    $U_{(\phi_1(t),\phi_2(t),\phi_3(t))} = e^{i(\phi_1(t) X\otimes X + \phi_2(t) Y\otimes Y + \phi_3(t) Z\otimes Z)}. \qquad (4.1)$ 

This case is dealt with in [163], but is not required here. Instead, we focus 
on time evolution operators whose 2-qubit term takes the form 
$U_{\phi t} = e^{i(\phi_1 X\otimes X + \phi_2 Y\otimes Y + \phi_3 Z\otimes Z)t}$ with constant $\phi$ (Fig. 4.1b). 
Comparing $\theta$ and $\phi$, $G$ can be expressed as at most three applications of $U_{\phi t}$ 
interspersed with at most eight single-qubit gates (Fig. 4.1c). 
Fig. 4.1: (a) Circuit equivalent to an arbitrary gate constructed via the canonical 
decomposition. (b) Similar equivalent circuit which exists for a restricted 
class of 2-qubit evolution operators. (c) Arbitrary gate expressed as at 
most three periods of evolution of the 2-qubit evolution operator and eight 
single-qubit gates. 

4.2 Terminology and notation 

This section contains the terminology and notation used in this chapter. 
Unless otherwise specified, all quantities in this chapter are expressed in the 
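computational basis $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$. Heavy use will also be made of 
the magic basis [167] 

    $|\Phi_1\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle),$ 
    $|\Phi_2\rangle = \frac{-i}{\sqrt{2}}(|00\rangle - |11\rangle),$ 
    $|\Phi_3\rangle = \frac{1}{\sqrt{2}}(|01\rangle - |10\rangle),$ 
    $|\Phi_4\rangle = \frac{-i}{\sqrt{2}}(|01\rangle + |10\rangle). \qquad (4.2)$ 

The transformation matrix from the magic basis to the computational basis 
is 

    $Q = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -i & 0 & 0 \\ 0 & 0 & 1 & -i \\ 0 & 0 & -1 & -i \\ 1 & i & 0 & 0 \end{pmatrix}. \qquad (4.3)$ 

States $|\Psi\rangle$ and matrices $G$ expressed in the magic basis will be denoted by 
$|\tilde{\Psi}\rangle = Q^\dagger|\Psi\rangle$ and $\tilde{G} = Q^\dagger G Q$ respectively. 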
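The properties of the magic basis used below can be checked numerically. The sketch assumes the matrix $Q$ has the magic basis states as its columns (with the $1/\sqrt{2}$ normalisation) and uses a hypothetical helper `to_magic`; it verifies nothing beyond what is stated in the text, but it makes the key fact concrete: a tensor product of two SU(2) matrices becomes a real orthogonal matrix in the magic basis.

```python
import numpy as np

# Columns are the magic basis states |Phi_1> ... |Phi_4> in the computational basis.
Q = np.array([[1, -1j, 0, 0],
              [0, 0, 1, -1j],
              [0, 0, -1, -1j],
              [1, 1j, 0, 0]], dtype=complex) / np.sqrt(2)


def to_magic(g):
    """Express a 4x4 matrix given in the computational basis in the magic basis."""
    return Q.conj().T @ g @ Q


def su2(alpha, beta, gamma):
    """Generic SU(2) element [[a, b], [-b*, a*]] with |a|^2 + |b|^2 = 1."""
    a = np.exp(1j * alpha) * np.cos(gamma)
    b = np.exp(1j * beta) * np.sin(gamma)
    return np.array([[a, b], [-np.conj(b), np.conj(a)]])
```

For example, `to_magic(np.kron(su2(0.3, 1.1, 0.7), su2(-0.4, 0.2, 1.2)))` has a vanishing imaginary part to numerical precision.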

Given an arbitrary 2-qubit gate $G$, a canonical decomposition is a set 
of four single-qubit gates $G_{1A}$, $G_{1B}$, $G_{2A}$, $G_{2B}$ and a purely non-local gate 
$G_\theta = e^{i(\theta_1 X\otimes X + \theta_2 Y\otimes Y + \theta_3 Z\otimes Z)}$ such that 

    $G \cong (G_{2A} \otimes G_{2B})\,G_\theta\,(G_{1A} \otimes G_{1B}), \qquad (4.4)$ 

where $\cong$ denotes equality up to global phase. Two sets of non-local 
parameters $\theta$, $\phi$ are called locally equivalent if there exist matrices 
$U_1, U_2, U_3, U_4 \in U(2)$ such that $G_\phi = (U_3 \otimes U_4)\,G_\theta\,(U_1 \otimes U_2)$. The canonical 
decomposition can be made unique by requiring $\theta_1 \geq \theta_2$, $\pi/2 - \theta_1 \geq \theta_2$, 
$\theta_2 \geq \theta_3$, $\theta_3 \geq 0$, and $\pi/2 - \theta_1 \geq \theta_1$ if $\theta_3 = 0$. 

4.3 Constructing a canonical decomposition 

In this section, we take a 2-qubit gate $G$ and construct a canonical 
decomposition $G \cong (G_{2A} \otimes G_{2B})\,G_\theta\,(G_{1A} \otimes G_{1B})$. Firstly, let $G_S = e^{i\delta}G$, 
$\delta = -\arg(\det(G))/4$, so that $\det(G_S) = 1$. Since arg only determines 
values up to $2n\pi$, $\delta$ is only determined up to $n\pi/2$. Arbitrarily choose 
$\delta \in (-\pi/4, \pi/4]$. As we shall see, $\det(G_S) = 1$ is required to enable the 
construction of $G_\theta$. 
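The first step, removing the global phase so that the determinant is 1, can be sketched as follows. The function name and the small tolerance used to place the boundary case inside $(-\pi/4, \pi/4]$ are choices made for this illustration only.

```python
import cmath
import math

import numpy as np


def special_form(g):
    """Return (delta, G_S) with G_S = exp(i delta) G and det(G_S) = 1.

    delta = -arg(det G) / 4 is chosen in the interval (-pi/4, pi/4].
    """
    delta = -cmath.phase(np.linalg.det(g)) / 4.0
    if delta <= -math.pi / 4.0 + 1e-12:
        # arg(det G) = pi gives delta = -pi/4; shift by pi/2 into the interval.
        delta += math.pi / 2.0
    return delta, cmath.exp(1j * delta) * np.asarray(g)


CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
```

For the CNOT gate, whose determinant is -1, this gives $\delta = \pi/4$.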

Next, we determine the eigensystem of $\tilde{G}_S^T\tilde{G}_S$ to obtain eigenvalues 
$\{e^{2i\epsilon_k}\}$ and eigenvectors $\{|\Psi_k\rangle\}$. Note that while $\tilde{G}_S^T\tilde{G}_S$ is symmetric and 
hence possesses a real, orthonormal eigenbasis, standard analytic and 
numerical methods of obtaining the eigenvectors in general only yield a linearly 
independent set. Hence, for the moment, we only assume that $\{|\Psi_k\rangle\}$ is 
linearly independent. 

In Ref. [161] it was shown that 

    $(\tilde{G}_S^T\tilde{G}_S|\Psi_k\rangle)^* = (\tilde{G}_S^T\tilde{G}_S)^{-1}|\Psi_k\rangle^* = e^{-2i\epsilon_k}|\Psi_k\rangle^*, \qquad (4.5)$ 

implying $\tilde{G}_S^T\tilde{G}_S|\Psi_k\rangle^* = e^{2i\epsilon_k}|\Psi_k\rangle^*$. By considering states $(|\Psi_k\rangle \pm |\Psi_k\rangle^*)/2$ it 
can be seen that both the real and imaginary parts of $|\Psi_k\rangle$ are themselves 
eigenvectors with the same eigenvalue. It can be shown that a subset of 
$\{\mathrm{Re}(|\Psi_k\rangle), \mathrm{Im}(|\Psi_k\rangle)\}$ always exists that is linearly independent and forms 
a real eigenbasis. Using the Gram-Schmidt procedure then gives us a real, 
orthonormal eigenbasis, which we redefine $\{|\Psi_k\rangle\}$ to equal. Let $\tilde{O}_1$ be the 
matrix with rows equal to $\{\langle\Psi_k|\}$. Note that $\tilde{O}_1$ is orthogonal and hence 
$|\det(\tilde{O}_1)| = 1$. If $\det(\tilde{O}_1) = -1$, redefine $|\Psi_1\rangle$ to equal $-|\Psi_1\rangle$ so that 
$\det(\tilde{O}_1) = 1$. Let 

    $G_{2\epsilon} = \mathrm{diag}(e^{2i\epsilon_1}, e^{2i\epsilon_2}, e^{2i\epsilon_3}, e^{2i\epsilon_4}). \qquad (4.6)$ 

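Note that as defined $\tilde{O}_1$ diagonalizes $\tilde{G}_S^T\tilde{G}_S$, meaning $\tilde{G}_S^T\tilde{G}_S = \tilde{O}_1^T G_{2\epsilon}\tilde{O}_1$. 

Compute $\epsilon_k \in (-\pi/2, \pi/2]$ from the eigenvalues $e^{2i\epsilon_k}$ of $\tilde{G}_S^T\tilde{G}_S$. Since 
$\det(\tilde{G}_S^T\tilde{G}_S) = 1$, $\sum_k \epsilon_k = n\pi$. If $n > 0$, subtract $\pi$ from the $n$ largest values 
of $\epsilon_k$, as we shall later use $\sum_k \epsilon_k = 0$ to eliminate $\epsilon_3$ and allow $\theta_1, \theta_2, \theta_3$ to 
be expressed in closed form in terms of $\epsilon_1, \epsilon_2, \epsilon_4$. Note that the eigenvalues 
$e^{2i\epsilon_k}$ are not changed by this convenient redefinition. Similarly, if $n < 0$, 
add $\pi$ to the $|n|$ most negative values of $\epsilon_k$. Let $\tilde{O}_2 = \tilde{G}_S\tilde{O}_1^T G_\epsilon^*$, where 
$G_\epsilon = \mathrm{diag}(e^{i\epsilon_1}, e^{i\epsilon_2}, e^{i\epsilon_3}, e^{i\epsilon_4})$. It can be 
directly verified that $\tilde{O}_2$ is special orthogonal. We now have $\tilde{G}_S = \tilde{O}_2 G_\epsilon\tilde{O}_1$, 
and are close to obtaining a canonical decomposition. 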

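In [EH] it was shown that $SO(4) = Q^\dagger(SU(2) \otimes SU(2))Q$, where $Q$ is 
the transformation matrix of Eq. (4.3). The generic form of an element of 
$SU(2) \otimes SU(2)$ is 

    $\begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix} \otimes \begin{pmatrix} \alpha & \beta \\ -\beta^* & \alpha^* \end{pmatrix} = \begin{pmatrix} a\alpha & a\beta & b\alpha & b\beta \\ -a\beta^* & a\alpha^* & -b\beta^* & b\alpha^* \\ -b^*\alpha & -b^*\beta & a^*\alpha & a^*\beta \\ b^*\beta^* & -b^*\alpha^* & -a^*\beta^* & a^*\alpha^* \end{pmatrix}, \qquad (4.7)$ 

where $a, b, \alpha, \beta \in \mathbb{C}$, $|a|^2 + |b|^2 = 1$, and $|\alpha|^2 + |\beta|^2 = 1$. After calculating 
$O_1 = Q\tilde{O}_1Q^\dagger \in SU(2) \otimes SU(2)$, two matrices $G_{1A}, G_{1B} \in SU(2)$ such that 
$O_1 = G_{1A} \otimes G_{1B}$ can be readily constructed from Eq. (4.7). For example, 
$a^2 = (a\alpha)(a\alpha^*) - (a\beta)(-a\beta^*)$. Matrices $G_{1A}$, $G_{1B}$ are unique up to a possible 
mutual sign flip. Matrices $G_{2A}, G_{2B} \in SU(2)$ such that $O_2 = G_{2A} \otimes G_{2B}$ 
can similarly be constructed. 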
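One way to carry out this factorization numerically is sketched below. It generalizes the example in the text: each 2x2 block of the 4x4 matrix is an entry of the first factor times the second factor, so the determinant of a block is the square of that entry. The function name is hypothetical, and the choice of the block with the largest determinant is a numerical-stability assumption of this sketch rather than part of the construction described above.

```python
import cmath

import numpy as np


def factor_su2_pair(o):
    """Split o = A (x) B with A, B in SU(2); returns (A, B) up to a joint sign."""
    o = np.asarray(o, dtype=complex)
    blocks = [[o[2 * r:2 * r + 2, 2 * c:2 * c + 2] for c in range(2)]
              for r in range(2)]
    # Each block equals A[r, c] * B and det(B) = 1, so det(block) = A[r, c]**2.
    r0, c0 = max(((r, c) for r in range(2) for c in range(2)),
                 key=lambda rc: abs(np.linalg.det(blocks[rc[0]][rc[1]])))
    a_entry = cmath.sqrt(np.linalg.det(blocks[r0][c0]))  # sign is arbitrary
    b = blocks[r0][c0] / a_entry
    # Recover every entry of A from trace(B^dagger (A[r, c] B)) = 2 A[r, c].
    a = np.array([[np.trace(b.conj().T @ blocks[r][c]) / 2 for c in range(2)]
                  for r in range(2)])
    return a, b
```

The arbitrary sign of the square root is exactly the mutual sign flip mentioned above: replacing both factors by their negatives leaves the tensor product unchanged.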

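All that remains to do is convert $G_\epsilon$ into $\tilde{G}_\theta = Q^\dagger G_\theta Q$, which is 

    $\tilde{G}_\theta = \mathrm{diag}\left(e^{i(\theta_1-\theta_2+\theta_3)},\; e^{i(-\theta_1+\theta_2+\theta_3)},\; e^{-i(\theta_1+\theta_2+\theta_3)},\; e^{i(\theta_1+\theta_2-\theta_3)}\right). \qquad (4.8)$ 

We therefore need 

    $\epsilon_1 = \theta_1 - \theta_2 + \theta_3, \qquad (4.9)$ 
    $\epsilon_2 = -\theta_1 + \theta_2 + \theta_3, \qquad (4.10)$ 
    $\epsilon_3 = -\theta_1 - \theta_2 - \theta_3, \qquad (4.11)$ 
    $\epsilon_4 = \theta_1 + \theta_2 - \theta_3. \qquad (4.12)$ 

Since we have ensured that $\sum_k \epsilon_k = 0$, Eqs (4.9)-(4.12) are consistent and 
can be inverted to give 

    $\theta_1 = (\epsilon_1 + \epsilon_4)/2, \qquad (4.13)$ 
    $\theta_2 = (\epsilon_2 + \epsilon_4)/2, \qquad (4.14)$ 
    $\theta_3 = (\epsilon_1 + \epsilon_2)/2. \qquad (4.15)$ 

We now have a canonical decomposition $G \cong G_S = (G_{2A} \otimes G_{2B})\,G_\theta\,(G_{1A} \otimes G_{1B})$. 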
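Eqs (4.8)-(4.15) can be verified numerically with a short sketch. It assumes the magic basis transformation $Q$ of Eq. (4.3) with $1/\sqrt{2}$ normalisation, builds $G_\theta$ from the identity $e^{i\theta P\otimes P} = \cos\theta\, I + i\sin\theta\, P\otimes P$, and uses hypothetical function names.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
Q = np.array([[1, -1j, 0, 0],
              [0, 0, 1, -1j],
              [0, 0, -1, -1j],
              [1, 1j, 0, 0]], dtype=complex) / np.sqrt(2)


def g_theta(t1, t2, t3):
    """exp(i(t1 XX + t2 YY + t3 ZZ)); the three terms commute, so factor them."""
    out = np.eye(4, dtype=complex)
    for angle, p in ((t1, X), (t2, Y), (t3, Z)):
        pp = np.kron(p, p)
        out = out @ (np.cos(angle) * np.eye(4) + 1j * np.sin(angle) * pp)
    return out


def theta_from_eps(e1, e2, e4):
    """Invert Eqs (4.9)-(4.12), using sum(eps) = 0 to eliminate eps_3."""
    return (e1 + e4) / 2, (e2 + e4) / 2, (e1 + e2) / 2


# In the magic basis G_theta is diagonal, with phases eps_1 ... eps_4.
diag = Q.conj().T @ g_theta(0.3, 0.2, 0.1) @ Q
eps = np.angle(np.diag(diag))
```

Here `theta_from_eps(eps[0], eps[1], eps[3])` recovers the input angles, provided they are small enough that no phase wraps around.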
4.4 Making a canonical decomposition unique 

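In this section, we explain how to modify $G_{1A}$, $G_{1B}$, $G_{2A}$, $G_{2B}$ to make $\theta$ 
unique. As it stands, $\theta \in (-\pi, \pi]^3$. By direct multiplication it can be seen 
that 

    $G_{(\theta_1 \pm \pi/2,\, \theta_2,\, \theta_3)} = \pm i\,(X \otimes X)\,G_{(\theta_1, \theta_2, \theta_3)} \qquad (4.16)$ 
    $G_{(\theta_1,\, \theta_2 \pm \pi/2,\, \theta_3)} = \pm i\,(Y \otimes Y)\,G_{(\theta_1, \theta_2, \theta_3)} \qquad (4.17)$ 
    $G_{(\theta_1,\, \theta_2,\, \theta_3 \pm \pi/2)} = \pm i\,(Z \otimes Z)\,G_{(\theta_1, \theta_2, \theta_3)}. \qquad (4.18)$ 

By appropriately redefining $G_{1A}$, $G_{1B}$, $G_{2A}$, $G_{2B}$, each $\theta_k$ can be restricted to the 
range $[0, \pi/2)$. 

As explained in beautiful detail in [163], there are 24 locally equivalent 
sets of values of $\theta \in [0, \pi/2)^3$. Given $(\theta_1, \theta_2, \theta_3)$, the locally equivalent 
values are $(\theta_i, \theta_j, \theta_k)$, $(\pi/2 - \theta_i, \pi/2 - \theta_j, \theta_k)$, $(\pi/2 - \theta_i, \theta_j, \pi/2 - \theta_k)$, and 
$(\theta_i, \pi/2 - \theta_j, \pi/2 - \theta_k)$ where $i, j, k$ are permutations of 1, 2, 3. To be more 
constructive, let 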



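    $P_{12} = e^{i\frac{\pi}{4}Z} \otimes e^{i\frac{\pi}{4}Z} \qquad (4.19)$ 
    $P_{13} = e^{i\frac{\pi}{4}Y} \otimes e^{i\frac{\pi}{4}Y} \qquad (4.20)$ 
    $P_{23} = e^{i\frac{\pi}{4}X} \otimes e^{i\frac{\pi}{4}X} \qquad (4.21)$ 
    $M_{12} = e^{i\frac{\pi}{4}Z} \otimes e^{-i\frac{\pi}{4}Z} \qquad (4.22)$ 
    $M_{13} = e^{i\frac{\pi}{4}Y} \otimes e^{-i\frac{\pi}{4}Y} \qquad (4.23)$ 
    $M_{23} = e^{i\frac{\pi}{4}X} \otimes e^{-i\frac{\pi}{4}X} \qquad (4.24)$ 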

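Given $G_{(\theta_1, \theta_2, \theta_3)}$, note that 

    $G_{(\theta_2, \theta_1, \theta_3)} = P_{12}\,G_{(\theta_1, \theta_2, \theta_3)}\,P_{12}^\dagger \qquad (4.25)$ 
    $G_{(\theta_3, \theta_2, \theta_1)} = P_{13}\,G_{(\theta_1, \theta_2, \theta_3)}\,P_{13}^\dagger \qquad (4.26)$ 
    $G_{(\theta_1, \theta_3, \theta_2)} = P_{23}\,G_{(\theta_1, \theta_2, \theta_3)}\,P_{23}^\dagger \qquad (4.27)$ 
    $G_{(-\theta_2, -\theta_1, \theta_3)} = M_{12}\,G_{(\theta_1, \theta_2, \theta_3)}\,M_{12}^\dagger \qquad (4.28)$ 
    $G_{(-\theta_3, \theta_2, -\theta_1)} = M_{13}\,G_{(\theta_1, \theta_2, \theta_3)}\,M_{13}^\dagger \qquad (4.29)$ 
    $G_{(\theta_1, -\theta_3, -\theta_2)} = M_{23}\,G_{(\theta_1, \theta_2, \theta_3)}\,M_{23}^\dagger. \qquad (4.30)$ 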

Using Eqs (4.16)-(4.18) and (4.25)-(4.30), it is always possible to obtain $\theta$ such
that $\theta_1 \geq \theta_2$, $\pi/2 - \theta_1 \geq \theta_2$, $\theta_2 \geq \theta_3$, $\theta_3 \geq 0$, and $\pi/2 - \theta_1 \geq \theta_1$ if $\theta_3 = 0$. To
obtain this unique $\theta$, use Eqs (4.25)-(4.27) to order $\theta_k$ such that $\theta_1 \geq \theta_2 \geq
\theta_3 \geq 0$. If $\theta_1 > \pi/4$ and $\theta_2 > \pi/2 - \theta_1$, use Eq. (4.28) and Eqs (4.16)-(4.17) to
obtain $(\pi/2 - \theta_2, \pi/2 - \theta_1, \theta_3)$. Finally, again use Eqs (4.25)-(4.27) to order
$\theta_k$. This gives the unique $\theta$ with the desired properties.
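
The ordering procedure just described acts only on the three angles, so it can be sketched without constructing any matrices. The function below is an illustrative sketch under stated assumptions, not the thesis implementation: the name `canonical_theta` is hypothetical, the initial reduction into $[0, \pi/2)$ stands in for Eqs (4.16)-(4.18), and the special case $\theta_3 = 0$ is not treated separately.

```python
import math

HALF_PI = math.pi / 2

def canonical_theta(theta):
    """Apply the reordering steps of the text to a triple of angles."""
    # Eqs (4.16)-(4.18): shifts by pi/2 are absorbed into the local gates.
    # Eqs (4.25)-(4.27): permutations let us sort in decreasing order.
    t = sorted((x % HALF_PI for x in theta), reverse=True)
    if t[0] > math.pi / 4 and t[1] > HALF_PI - t[0]:
        # Eq. (4.28) followed by Eqs (4.16)-(4.17).
        t = [HALF_PI - t[1], HALF_PI - t[0], t[2]]
        t.sort(reverse=True)   # Eqs (4.25)-(4.27) again
    return tuple(t)

print(canonical_theta((1.2, 0.9, 0.1)))
```

The result always satisfies $\theta_1 \geq \theta_2 \geq \theta_3 \geq 0$ and $\pi/2 - \theta_1 \geq \theta_2$, which can be confirmed by checking the three possible orderings after the reflection step.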

For the remainder of the thesis, when talking about a canonical decom-
position of a gate $G$, we mean $G_{1A}, G_{1B}, G_{2A}, G_{2B} \in U(2)$ and the unique
$\theta$ described above such that $G = (G_{2A} \otimes G_{2B}) G_\theta (G_{1A} \otimes G_{1B})$ up to global
phase. We have not restricted the single-qubit unitaries to be special as
none of the standard single-qubit gates H, X, Z, S and T are special and
we wish to express $G_{1A}$, $G_{1B}$, $G_{2A}$ and $G_{2B}$ in terms of standard gates
whenever possible.

4.5 Building gates out of physical interactions 

To complete the construction of an arbitrary gate G in terms of a 2-qubit 
evolution operator U{t) admitting a canonical decomposition with 2-qubit 
term Ur, and single-qubit gates, let 



G ^ {G2A^G2B)Gg-{GiA(^GiB), 
U{t) ^ {U2A{t)0U2B{t))Ur,{UiA{t)0UiB{t)), 



(4.31) 
(4.32) 
and following [163], consider



^ ^i{4>iX(»X+i<l>2Y®Y+i<l>-iZ®Z)ti 

= gj(</>ltl-</>3t2+9!'3t3)^«l^+*(</'2il-9!'lt2-</>2t3)5^®5^+«(</>3tl+</>2t2-</>lt3)^'X)^, 



(4.33) 



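With the conditions on $\phi$, the equations

\begin{equation}
\begin{pmatrix}
\phi_1 & -\phi_3 & \phi_3 \\
\phi_2 & -\phi_1 & -\phi_2 \\
\phi_3 & \phi_2 & -\phi_1
\end{pmatrix}
\begin{pmatrix} t_1 \\ t_2 \\ t_3 \end{pmatrix}
=
\begin{pmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \end{pmatrix}
\tag{4.34}
\end{equation}

are invertible. Note that this system of equations is neither unique nor
special, as other matrices from Eqs (4.25)-(4.30) could have been used in
Eq. (4.33). For example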



X M23Mi3e*(<^i^®-^+*'?^2yc>3y+i032c>3Z)t2jy^t^^t 



12 ^'^^12 
13^23 



^ ^i(4>xXtsX+i<t>2Y®Y+i(j>3Z®Z)tx 

^ gi(</.lil-03i2-</'lt3)^®^+i(02tl+0lt2-02t3)5^®5^+i(03tl-02i2+</'3i3)^®^ 



(4.35) 



also leads to an invertible system of equations. 
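
As an illustration of solving the linear system of Eq. (4.34) for the evolution times, the sketch below uses Cramer's rule on the coefficient matrix with rows $(\phi_1, -\phi_3, \phi_3)$, $(\phi_2, -\phi_1, -\phi_2)$, $(\phi_3, \phi_2, -\phi_1)$. The function names and the example values of $\phi$ and $\theta$ are assumptions made for the example, and no attempt is made to map the solution onto non-negative times.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve_times(phi, theta):
    """Solve Eq. (4.34) for (t1, t2, t3) using Cramer's rule."""
    p1, p2, p3 = phi
    a = [[p1, -p3, p3],
         [p2, -p1, -p2],
         [p3, p2, -p1]]
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("coefficient matrix is singular for this phi")
    times = []
    for col in range(3):
        m = [row[:] for row in a]
        for row in range(3):
            m[row][col] = theta[row]   # replace one column by the target
        times.append(det3(m) / d)
    return tuple(times)

# Example with phi1 = phi2 and phi3 = 0 (illustrative values only).
print(solve_times((0.3, 0.3, 0.0), (0.3, 0.1, 0.2)))
```

For $\phi_1 = \phi_2 = \phi$ and $\phi_3 = 0$ the determinant is $2\phi^3$, so the system is invertible whenever $\phi \neq 0$.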
After solving Eq. (4.34) for $t_1$, $t_2$, $t_3$, let



Ui = Ul^{ti)GiA (4.36) 
U2 = Uls{h)GiB (4.37) 
Us = t/L(t2)e*"/^^e*"/^^[/i^(ti) (4.38) 



Us



U5 



(t 3 ) e*"/^^ e'"/^^ e*"/^^ ^-^./4Z ^-^./4X jjt^ ^ 
G2B e-/4^ e-^'^/^^ e-^'^/^^ ^^^Ib 3 ) ■ 



(4.43) 



(4.41) 



(4.42) 



(4.40) 



(4.39) 



With these definitions, it can be verified by direct multiplication that

\begin{equation}
G = (U_7 \otimes U_8)\, U(t_3)\, (U_5 \otimes U_6)\, U(t_2)\, (U_3 \otimes U_4)\, U(t_1)\, (U_1 \otimes U_2). \tag{4.44}
\end{equation}



Note that arbitrary choices have entered the calculation at a number of 
points meaning the above construction is not unique and may not be optimal. 
It can, however, be shown that no general construction of this form can use 
fewer than three periods of evolution of U{t) and eight single qubit gates in 
the worst case [163]. This suggests that the above implementation is close
to optimal, and at worst efficient. 



4.6 Conclusion

The construction of this chapter enables arbitrary 2-qubit gates to be ex-
pressed as a simple circuit involving at most eight single-qubit gates and
three periods of evolution of a restricted class of 2-qubit interactions. This
implies that if multiple single and 2-qubit gates are applied to the same pair
of qubits, they should all be combined into a single compound gate to save
time and reduce circuit complexity. Chapters 5, 6 and 8 rely heavily on this
technique to absorb swap gates into neighboring useful gates, dramatically
simplifying the circuits described therein.
5. 5-qubit QEC on an LNN QC



The question has been raised as to how well quantum error correction (QEC) 
can be implemented on a linear nearest neighbor (LNN) quantum computer 
[168] due to the expectation that numerous swap gates will be required.
Working out a way around this is important due to the large number of 
LNN architectures currently under investigation [03 EH ITIIH EHl 
nrni nm ITTHI im Uni IHni 1113. in this chapter, a quantum 
circuit implementing 5-qubit QEC on an LNN architecture is described. Our 
goal is to keep the error correction scheme as simple as possible to facilitate 
physical realization. In particular, fault-tolerance has not been built into 
the circuit to minimize its complexity and the required number of qubits. 
Despite the lack of fault-tolerance, we show that, for both a discrete and 
continuous error model, a threshold physical error rate exists below which 
the circuit reduces the probability of error in the protected logical qubit. 
We also determine the required physical error rate for the logical qubit to 
be 10 times and 100 times as reliable as a single unprotected qubit. 

This chapter is organized as follows. Firstly, explicit examples of canoni-
cally decomposed compound gates incorporating the swap gates required on
an LNN architecture are given in Section 5.1. In Section 5.2 the non-fault-
tolerant 5-qubit QEC scheme is described and the LNN circuit presented.
Simulations of the performance of the LNN scheme when subjected to both
discrete and continuous errors are discussed in Section 5.3. Section 5.4
concludes with a summary of all results and a description of further work.



5.1 Compound gates 

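As discussed in detail in Chapter 4, the canonical decomposition enables
any 2-qubit gate $G$ to be expressed (non-uniquely) in the form

\begin{equation}
G = (G_{2A} \otimes G_{2B}) G_\theta (G_{1A} \otimes G_{1B}) \tag{5.1}
\end{equation}

where $G_{1A}, G_{1B}, G_{2A}, G_{2B} \in U(2)$ and

\begin{equation}
G_\theta = e^{i(\theta_1 X \otimes X + \theta_2 Y \otimes Y + \theta_3 Z \otimes Z)}. \tag{5.2}
\end{equation}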

Provided a quantum computer allows qubits to be isolated, and has a 2-qubit 
evolution operator U{t) admitting a canonical decomposition with 2-qubit 
term U^^, an implementation of G exists using at most three periods of free 
evolution of U{t) and eight single-qubit gates. 

Fig. 5.1a shows the form of a canonically decomposed CNOT on a Kane
quantum computer [93, 144]. Z-rotations have been represented by quar-
ter, half and three-quarter circles corresponding to $R_z(\pi/2)$, $R_z(\pi)$, and
$R_z(3\pi/2)$ respectively, where

Rz = (5.3) 



Full circles represent Z-rotations of angle dependent on the physical con-
struction of the computer (static magnetic field, phosphorus donor place-
ment etc). The details of obtaining the canonical decomposition of the Kane
2-qubit evolution operator are contained in [144]. Up to a couple of Z-rotations,
the 2-qubit interaction corresponds to = (j)2 = vr/n, and (ps = 0. Square 



Fig. 5.1: Decomposition into physical operations of (a) CNOT and (b) Hadamard,
CNOT then swap. Note that the Kane architecture has been used for
illustrative purposes.



gates 1 and 2 correspond to X-rotations $R_x(\pi)$ and $R_x(\pi/2)$. Fig. 5.1b shows
an implementation of the composite gate Hadamard followed by CNOT fol-
lowed by swap. Note that the total time of the compound gate is significantly
less than the CNOT on its own. This fact has been used to minimize the total
execution time of the LNN circuit of Fig. 5.2b.

The above implies that the swaps inevitably required in an LNN ar- 
chitecture to bring qubits together to be interacted can, in some cases, be 
incorporated into other gates without additional cost. On any architecture, 
canonically decomposed compound gates should be used whenever multiple 
single and 2-qubit gates are applied to the same two qubits. 



5.2 5-qubit LNN QEC 

5-qubit QEC schemes are designed to correct a single arbitrary error. No 
QEC scheme designed to correct a single arbitrary error can use less than five 
qubits [SOI- A number of 5-qubit QEC proposals exist [T78l ITTni [TT2l ITWIl 
llHlj . Fig. 15.2b shows a non-fault-tolerant circuit appropriate for an LNN 
architecture implementing the encode stage of the QEC scheme proposed in 
[178]. For reference, the original circuit is shown in Fig. 5.2a. Note that the

Fig. 5.2: (a) 5-qubit encoding circuit for general architecture, (b) equivalent
circuit for linear nearest neighbor architecture with dashed boxes indicating
compound gates. CNOT gates that must be performed sequentially are
numbered.



LNN circuit uses exactly the same number of CNOTs and achieves minimal
depth since the CNOT gates numbered 1-6 in Fig. 5.2 must be performed
sequentially on any architecture that can only interact pairs of qubits (not
three or more at once). The two extra "naked" swaps in Fig. 5.2b do not
significantly add to the total time of the circuit. Fig. 5.3 shows an equivalent
circuit broken into physical operations for a Kane quantum computer. Note
that this circuit uses the fact that if two 2-qubit gates share a qubit then
two single-qubit unitaries can be combined as shown in Fig. 5.4. The decode
circuit is simply the encode circuit run backwards. 5-qubit QEC schemes are
primarily useful for data storage due to the impossibility of fault-tolerantly
interacting two logical qubits jllSj . though with some effort it is possible 
to nontrivially interact three logical qubits. Fig. 5.5 shows a full encode-
wait-decode-measure-correct data storage cycle. Table 5.1 shows the range
of possible measurements and the action required in each case.



5.3 Simulation of performance 

When simulating the QEC cycle, the LNN circuit of Fig. 5.2b was used to
keep the analysis independent of the specific architecture used. Each com-
pound gate was modelled as taking the same time, allowing the time T to
be made an integer such that each gate takes one time step. Gates were fur-
thermore simulated as though perfectly reliable and errors applied to each
qubit (including idle qubits) at the end of each time step. The rationale for
including idle qubits is that, in an LNN architecture, active physical manip-
ulation of some description is frequently required to decouple neighboring
qubits. Both the manipulation itself and the degree of decoupling are likely
to be imperfect, leading to errors. Furthermore, in schemes utilizing global
electromagnetic fields to manipulate active qubits, supposedly idle qubits
may not be sufficiently off resonant.

Fig. 5.3: A sequence of physical gates implementing the circuit of Fig. 5.2b. Note
the Kane architecture has been used for illustrative purposes.

Fig. 5.4: Circuit equivalence used to reduce the number of physical gates in Fig. 5.3.

Fig. 5.5: A complete encode-wait-decode-measure-correct QEC cycle.

Tab. 5.1: Action required to correct the data qubit vs measured value of ancilla
qubits. Note that the X-operations simply reset the ancilla.

Two error models were used — discrete and continuous. In the discrete 
model, a qubit can suffer either a bit-flip (X), phase-flip (Z) or both simul- 
taneously (XZ). Each type of error is equally likely with total probability 
of error p per qubit per time step. The continuous error model involves 
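applying single-qubit unitary operations of the form

\begin{equation}
\begin{pmatrix}
\cos(\theta/2)\, e^{i(\alpha+\beta)/2} & \sin(\theta/2)\, e^{i(\alpha-\beta)/2} \\
-\sin(\theta/2)\, e^{i(-\alpha+\beta)/2} & \cos(\theta/2)\, e^{i(-\alpha-\beta)/2}
\end{pmatrix}
\tag{5.4}
\end{equation}

where $\alpha$, $\beta$, and $\theta$ are normally distributed about 0 with standard deviation
$\sigma$.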
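
The two error models can be sampled as follows. This is a minimal sketch rather than the simulation code used to produce the results of this chapter; the function names and the `rng` parameter are assumptions made for the example. The discrete sampler assigns probability $p/3$ to each of X, Z and XZ, and the continuous sampler draws the three angles of Eq. (5.4) from a normal distribution with mean 0 and standard deviation $\sigma$.

```python
import cmath
import math
import random

def discrete_error(p, rng=random):
    """Return 'I', 'X', 'Z' or 'XZ'; total error probability is p."""
    u = rng.random()
    if u >= p:
        return "I"
    return ("X", "Z", "XZ")[min(int(3 * u / p), 2)]

def continuous_error(sigma, rng=random):
    """Sample the 2x2 unitary of Eq. (5.4) as nested lists."""
    alpha, beta, theta = (rng.gauss(0.0, sigma) for _ in range(3))
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * cmath.exp(1j * (alpha + beta) / 2),
             s * cmath.exp(1j * (alpha - beta) / 2)],
            [-s * cmath.exp(1j * (-alpha + beta) / 2),
             c * cmath.exp(1j * (-alpha - beta) / 2)]]

rng = random.Random(1)               # fixed seed for repeatability
print(discrete_error(1e-3, rng))
print(continuous_error(1e-2, rng))
```

In a simulation of the kind described above, one such error would be applied to every qubit, idle or not, at the end of each time step.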

Both the single-qubit and single logical qubit (five qubit) systems were
simulated. The initial state

\begin{equation}
|\Psi\rangle = \sin(\pi/8)|0\rangle + \cos(\pi/8)|1\rangle \tag{5.5}
\end{equation}

was used in both cases since $|\langle\Psi|X|\Psi\rangle|^2 = 0.5$, $|\langle\Psi|Z|\Psi\rangle|^2 = 0.5$, and
$|\langle\Psi|XZ|\Psi\rangle|^2 = 0$, thus allowing each type of error to be detected (but not
necessarily distinguished). Simpler states such as $|0\rangle$, $|1\rangle$, $(|0\rangle + |1\rangle)/\sqrt{2}$, and
$(|0\rangle - |1\rangle)/\sqrt{2}$ do not have this property. For example, the states $|0\rangle$ and $|1\rangle$
are insensitive to phase errors, whereas the other two states are insensitive
to bit flip errors.

Let $T_{wait}$ denote the duration of the wait stage. Note that the total
duration of the encode, decode, measure and correct stages is 14. In the
QEC case the total time $T = T_{wait} + 14$ of one QEC cycle was varied to
determine the time that minimizes the error per time step

\begin{equation}
\epsilon_{step} = 1 - \sqrt[T]{1 - \epsilon_{final}} \tag{5.6}
\end{equation}

where $\epsilon_{final} = 1 - |\langle\Psi'|\Psi\rangle|^2$ and $|\Psi'\rangle$ is the final data qubit state. An optimal
time $T_{opt}$ exists since the logical qubit is only protected during the wait stage
and the correction process can only cope with one error. If the wait time is
zero, extra complexity has been added but no corrective ability. Similarly, if
the wait time is very large, it is almost certain that more than one error will
occur, resulting in the qubit being destroyed during the correction process.
Somewhere between these two extremes is a wait time that minimizes $\epsilon_{step}$.
This property of non-fault-tolerant QEC has been noted previously [182].

p             T_opt          ε_step           ε_step/p
1.6 × 10^-3   (unreadable)   1.6 × 10^-3      1.0 × 10^0
10^-3         50             8.4 × 10^-4      8.4 × 10^-1
10^-4         150            3.1 × 10^-5      3.1 × 10^-1
10^-5         500            1.0 × 10^-6      1.0 × 10^-1
10^-6         1500           3.2 × 10^-8      3.2 × 10^-2
10^-7         5000           1.0 × 10^-9      1.0 × 10^-2
10^-8         10000          2.0 × 10^-11     2.0 × 10^-3

Tab. 5.2: Probability per time step ε_step of the logical qubit being destroyed when
using 5-qubit QEC vs physical probability p per qubit per time step of a
discrete error.
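
The sensitivity of the initial state of Eq. (5.5) to each error type can be confirmed with a few lines of arithmetic. The sketch below is illustrative only, with hypothetical helper names; it evaluates $|\langle\psi|E|\psi\rangle|^2$ for $E \in \{X, Z, XZ\}$, where a value of 1 means the state is unchanged by that error up to phase and hence cannot reveal it.

```python
import math

X = ((0, 1), (1, 0))
Z = ((1, 0), (0, -1))
XZ = ((0, -1), (1, 0))               # matrix product X·Z

def overlap_sq(op, psi):
    """Return |<psi|op|psi>|^2 for a real 2x2 operator and a real state."""
    image = [sum(op[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum(psi[i] * image[i] for i in range(2)) ** 2

psi = (math.sin(math.pi / 8), math.cos(math.pi / 8))   # Eq. (5.5)
zero = (1.0, 0.0)
plus = (1 / math.sqrt(2), 1 / math.sqrt(2))

for name, state in (("psi", psi), ("|0>", zero), ("|+>", plus)):
    print(name, [round(overlap_sq(op, state), 3) for op in (X, Z, XZ)])
```

For $|0\rangle$ the Z overlap is 1 and for $(|0\rangle + |1\rangle)/\sqrt{2}$ the X overlap is 1, which is the insensitivity referred to in the text, whereas for $|\Psi\rangle$ no overlap reaches 1.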

Table 5.2 shows $T_{opt}$, $\epsilon_{step}$ and the reduction in error $\epsilon_{step}/p$ versus $p$ for
discrete errors. Table 5.3 shows the corresponding data for continuous errors.
Note that, in the continuous case, the single qubit $p$ has been obtained via
1-qubit simulations using the indicated $\sigma$ and wait time $T = T_{opt} + 14$ and
a 1-qubit version of Eq. (5.6)

\begin{equation}
p = 1 - \sqrt[T]{1 - \epsilon_{final}} \tag{5.7}
\end{equation}

where $\epsilon_{final} = 1 - |\langle\Psi'|\Psi\rangle|^2$ and $|\Psi'\rangle$ is the final single-qubit state. In this
context, $p$ is the discrete error rate yielding the same final error probability
as the corresponding $\sigma$ over time $T$.
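
The conversion between a final error probability and an equivalent per-step error rate can be written as a pair of inverse functions. This sketch assumes the per-step relation has the form $1 - (1 - \epsilon_{final})^{1/T}$, as in Eq. (5.7); the function names are illustrative.

```python
def error_per_step(eps_final, steps):
    """Per-step error rate giving total error eps_final after `steps` steps."""
    return 1.0 - (1.0 - eps_final) ** (1.0 / steps)

def final_error(eps_step, steps):
    """Inverse relation: total error after `steps` independent steps."""
    return 1.0 - (1.0 - eps_step) ** steps

# Round trip for an illustrative cycle length and error rate.
T = 164
print(error_per_step(final_error(1e-4, T), T))
```

The relation treats each time step as an independent chance of failure, so a qubit survives $T$ steps with probability $(1 - \epsilon_{step})^T$.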

The threshold $p = 1.6 \times 10^{-3}$ shown in Table 5.2 is comparable to some of
the highest thresholds of fault-tolerant quantum computation described in
Chapter 1, which were obtained using weaker noise models and architectures
able to interact arbitrary pairs of qubits. If an error rate improvement of
a factor of 10 or 100 is desired when using our scheme, then $p = 10^{-5}$ or
$p = 10^{-7}$ is required respectively. Note that unlike fault-tolerant schemes,
the error rate of the logical qubit does not scale as $cp^2$.

For continuous errors, the threshold standard deviation is $\sigma = 4.7 \times 10^{-2}$.
The logical qubit is a factor of 10 more reliable than a single physical qubit
for $\sigma = 3.6 \times 10^{-3}$. A factor of 100 improvement is achieved when $\sigma =
4.0 \times 10^{-4}$.

Tab. 5.3: Probability per time step ε_step of the logical qubit being destroyed when
using 5-qubit QEC vs standard deviation σ of continuous errors.

5.4 Conclusion 

To summarize, we have presented a non-fault-tolerant circuit implementing
5-qubit QEC on an LNN architecture that achieves the same depth as the
current least depth circuit [178], and simulated its effectiveness against both
discrete and continuous errors. For the discrete error model, if error cor-
rection is to provide an error rate reduction of a factor of 10 or 100, the
physical error rate $p$ must be $10^{-5}$ or $10^{-7}$ respectively. The corresponding
figures for the continuous error model are $\sigma = 3.6 \times 10^{-3}$ and $4.0 \times 10^{-4}$.
Further work is required to determine whether the discrete or continuous
error model or some other model best describes errors in physical quantum
computers. The relationship between the two error models also warrants
further investigation. Further simulation is required to determine the er-
ror thresholds and scaling associated with single and 2-qubit LNN QEC
protected gates.



6. QEC without measurement 



In Chapter 5 we described and analyzed an explicit 5-qubit non-fault-
tolerant LNN QEC scheme. In this chapter, we wish to relax the engineering
requirements of this scheme further. In particular, for many architectures
the most difficult aspect of quantum error correction is measuring qubits
quickly and/or reliably. Interactive classical processing of measurement re-
sults can also be problematic. We therefore present an explicit 5-qubit QEC
scheme that only requires slow resetting and no classical processing at the
cost of halving the performance of the original approach and potentially
requiring an additional 4 ancilla qubits. Prior work exists on removing mea-
surement from fault-tolerant QEC [124, 183], but this approach requires
many more qubits than the scheme presented here. A similar no measure-
ment non-fault-tolerant LNN QEC scheme has been devised and physically
implemented using liquid NMR technology [179], but is not directly appli-
cable to other technologies, and only permits a single cycle of QEC to be
performed.

The discussion is organized as follows. In Section 6.1 the concept of
resetting as distinct from measurement is explained in more detail. In Sec-
tion 6.2 a QEC scheme requiring fast resetting and no classical processing
is described and its performance simulated. In Section 6.3 additional qubits
are added to the circuit to enable the use of a slow reset operation. Sec-
tion 6.4 concludes with a summary of our results and a description of further
work.

Fig. 6.1: Example of a physical model of resetting through relaxation. Given a
double potential well with left and right occupancy representing $|0\rangle$ and
$|1\rangle$ respectively, resetting to $|0\rangle$ can be achieved by lowering the barrier
and applying a bias.



6.1 Resetting

Resetting is distinct from measurement in that a given qubit is in a known 
state after resetting but no information about its state beforehand is pro- 
vided. A physical example of resetting is provided by a double quantum dot 
system separated by a potential barrier in which |0) is represented by an 
electron in the left dot, and |1) by an electron in the right dot. By apply- 
ing an electric field across the double dot system and lowering the barrier 
potential, the electron can be encouraged to relax into the |0) state. This 
process is illustrated in Fig. 6.1.

Fig. 6.2: Quantum circuit acting on a recently decoded logical qubit to correct the
data qubit based on the value of the ancilla qubits. Hollow dots represent
control qubits that must be $|0\rangle$ for the attached gate to be applied.

6.2 5-qubit QEC without measurement 

To eliminate measurement from Fig. 5.5 the implicit classical logic that
converts the measured result into a corrective action must be converted
into quantum logic gates. Fig. 6.2 shows a quantum circuit performing
the necessary logic. The first half of the circuit rearranges the states 0000
to 1111 such that the required corrective action is as shown in column 3
of Table 6.1. This rearrangement of actions leads to the relatively simple
corrective logic of the second half of Fig. 6.2.

In keeping with Chapter 5 an LNN version of Fig. 6.2 has been devised
and is shown in Fig. 6.3. The LNN circuit was used in all simulations.

Given the increased complexity of Fig. 6.3 compared with Fig. 5.5 it is
expected that the threshold error rates for both the discrete and continuous
error models will be lower. Furthermore, the optimal wait time between
correction cycles is expected to be longer to balance the need for correction
against the longer period of vulnerability during correction. Both of these
effects can be observed in Tables 6.2 and 6.3.

For discrete errors, the new threshold error rate, and the error rates at 
which a factor of 10 and 100 improvement in reliability are achieved are 






Ancilla   Action   Action
0000      I        I
0001      I        I
0010      I        I
0011      Z        I
0100      I        XZ
0101      X        X
0110      Z        Z
0111      X        I
1000      Z        X
1001      I        Z
1010      X        X
1011      X        Z
1100      Z        X
1101      X        Z
1110      XZ       X
1111      Z        Z

Tab. 6.1: Second column shows the action required to correct the data qubit given
a certain ancilla value immediately after decoding. Third column shows
the action required to correct the data qubit given a certain ancilla value
after the application of the first half of Fig. 6.2.

p             T_opt    ε_step          ε_step/p
10^-3         120      1.4 × 10^-3     1.4 × 10^0
3.7 × 10^-4   190      3.7 × 10^-4     1.0 × 10^0
10^-4         320      6.0 × 10^-5     6.0 × 10^-1
10^-5         1100     2.1 × 10^-6     2.1 × 10^-1
2.2 × 10^-6   2300     2.2 × 10^-7     1.0 × 10^-1
10^-6         3000     7.1 × 10^-8     7.1 × 10^-2
10^-7         15000    1.9 × 10^-9     1.9 × 10^-2
2.1 × 10^-8   40000    2.1 × 10^-10    1.0 × 10^-2

Tab. 6.2: Probability per time step ε_step of the logical qubit being destroyed when
using no measurement 5-qubit QEC vs physical probability p per qubit
per time step of a discrete error.



σ             T_opt        p              ε_step         ε_step/p
3.1 × 10^-2   1.0 × 10^2   4.6 × 10^-4    4.6 × 10^-4    1.0 × 10^0
10^-2         3.2 × 10^2   4.9 × 10^-5    2.2 × 10^-5    4.5 × 10^-1
2.0 × 10^-3   1.8 × 10^3   2.0 × 10^-6    2.0 × 10^-7    1.0 × 10^-1
10^-3         3.7 × 10^3   5.0 × 10^-7    2.4 × 10^-8    4.8 × 10^-2
2.1 × 10^-4   2.4 × 10^4   2.1 × 10^-8    2.1 × 10^-10   1.0 × 10^-2
10^-4         5.0 × 10^4   5.0 × 10^-9    1.8 × 10^-11   3.6 × 10^-3
10^-5         4.0 × 10^5   5.1 × 10^-11   1.5 × 10^-14   2.9 × 10^-4

Tab. 6.3: Probability per time step ε_step of the logical qubit being destroyed when
using no measurement 5-qubit QEC vs standard deviation σ of continuous
errors.
$p = 3.7 \times 10^{-4}$, $2.2 \times 10^{-6}$, and $2.1 \times 10^{-8}$ respectively. Note that these
are approximately a factor of 5 less than the corresponding results obtained
in Chapter 5. For continuous errors, the pertinent standard deviations are
$\sigma = 3.1 \times 10^{-2}$, $2.0 \times 10^{-3}$, and $2.1 \times 10^{-4}$. These are very comparable to
the results obtained in Chapter 5, being at most a factor of 2 less.

6.3 5-qubit QEC with slow resetting 

The use of Fig. 6.3 in a 5-qubit system assumes the reset operation is fast
(comparable to the time required to implement a single quantum gate). This
requirement can be eliminated with the addition of four ancilla qubits as
shown in Fig. 6.4. By re-encoding with fresh ancilla, the reset operation may
now take an amount of time equal to an entire QEC cycle. From Tables 6.2
and 6.3, depending on the physical error rate, this can be thousands of times
longer than a single gate operation.

6.4 Conclusion 

To summarize, we have shown that even without fault-tolerance, fast mea- 
surement, classical processing, and large numbers of qubits, it is still possible 
to construct a general quantum error correction scheme with a reasonably 
high ($p = 3.7 \times 10^{-4}$ or $\sigma = 3.1 \times 10^{-2}$) threshold error rate.

7. Shor's algorithm



Chapters IHl ini and to a lesser extent Chapter deal with aspects of imple- 
menting Shor's algorithm [Sj I184j . This chapter provides a detailed review 
of the algorithm. In addition to Shor's papers, we draw heavily on Ref. |3nj . 

When Shor's algorithm was published in 1994, it was greeted with great 
excitement due to its potential to break the popular RSA encryption pro- 
tocol 7 . RSA is used in all aspects of e-commerce from Internet banking 
to secure online payment and can also be used to facilitate secure message 
transmission. The security of RSA is conditional on large integers being 
difficult to factorize, which has so far proven to be the case when using clas- 
sical computers. Given a quantum computer, Shor's algorithm renders the 
integer factoring problem tractable. 

To be precise, let $N = N_1 N_2$ be a product of prime numbers. Let
$L = \log_2 N$ be the binary length of $N$. Given $N$, Shor's algorithm enables
the determination of $N_1$ and $N_2$ in a time polynomial in $L$. This is achieved
indirectly by finding the period $r$ of $f(k) = m^k \bmod N$, where $1 < m < N$,
$\gcd(m, N) = 1$ and gcd denotes the greatest common divisor. Provided
$r$ is even and $f(r/2) \neq N - 1$, the factors are $N_1 = \gcd(f(r/2) + 1, N)$
and $N_2 = \gcd(f(r/2) - 1, N)$. Note that the greatest common divisor can
be computed in a time linear in $L$ using a classical computer. For odd $N$
and randomly selected $m$ such that $1 < m < N$ and $\gcd(m, N) = 1$, the
probability that f{k) has a suitable r is at least 0.75 jHOl- Thus on average 

Circuit            Qubits    Depth
Beauregard [191]   ~ 2L      ~ 32L^3
Vedral [186]       5L        240L^3
Zalka 1 [190]      ~ 5L      ~ 3000L^2
Zalka 2            ~ 50L     ~ 2^19 L^1.2
Van Meter [185]    O(L^2)    O(L log^2 L)

Tab. 7.1: Required number of qubits and circuit depth of different implementations
of the quantum part of Shor's algorithm. Where possible, figures are
accurate to leading order in L.

very few values of m need to be tested to factor N. 
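
The classical route from a period to the factors can be illustrated with a brute-force period search, which is only feasible for small N and stands in here for the quantum subroutine. The function names are illustrative; the logic follows the conditions stated above (r even, $f(r/2) \neq N - 1$).

```python
from math import gcd

def period(m, n):
    """Smallest r > 0 with m**r % n == 1, found by brute force."""
    r, x = 1, m % n
    while x != 1:
        x = (x * m) % n
        r += 1
    return r

def factors_from_period(m, n):
    """Return (N1, N2) if m yields a suitable period, else None."""
    if gcd(m, n) != 1:
        raise ValueError("m must be coprime to N")
    r = period(m, n)
    if r % 2 == 1:
        return None                  # r odd: try another m
    half = pow(m, r // 2, n)         # f(r/2)
    if half == n - 1:
        return None                  # f(r/2) = N - 1: try another m
    return gcd(half + 1, n), gcd(half - 1, n)

print(period(2, 143), factors_from_period(2, 143))
```

For N = 143 and m = 2 the period is 60 and $f(30) = 12$, giving the factors 13 and 11.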

The quantum part of Shor's algorithm can be viewed as a subroutine that
generates numbers of the form $j \simeq c2^{2L}/r$. To distinguish this from the nec-
essary classical pre- and post-processing, this subroutine will be referred to
as quantum period finding (QPF). Due to decoherence and imprecise gates,
the probability $s$ that QPF will successfully generate useful data (defined
precisely below) may be quite low with many repetitions required to work
out the period $r$ of a given $f(k) = m^k \bmod N$. Using this terminology,
Shor's algorithm consists of classical preprocessing, potentially many repe-
titions of QPF with classical postprocessing and possibly a small number
of repetitions of the entire cycle followed by more classical post-processing
(Fig. 7.1).

The efficiency of QPF can only be quantified with reference to a specific 
quantum circuit implementation. To date, the most thorough work on dif- 
ferent quantum circuit implementations has been performed by Van Meter 
[185] drawing on work by Vedral [186], Beckman [187], Gossett [188], Draper
[189] and Zalka [190]. Table 7.1 gives representative examples of the variety
of circuits in existence and their qubit counts and depths as a function of L. 
Note that generally speaking time can be saved at the cost of more qubits. 

An underlying procedure common to all implementations does exist. The 

[Fig. 7.1 flowchart, boxes in order:]
Select $1 < m < N$ such that $\gcd(m, N) = 1$ (classical).
Try to find $j \simeq c2^{2L}/r$ (quantum).
Try to use $j$ to find period $r$ of $f(k) = m^k \bmod N$ (classical). Fail: repeat the quantum step. Success: continue.
Test whether $r$ is even and $m^{r/2} \bmod N \neq \pm 1 \bmod N$ (classical). Fail: select a new $m$. Success: continue.
$N_1 = \gcd(m^{r/2} - 1, N)$, $N_2 = \gcd(m^{r/2} + 1, N)$ (classical).

Fig. 7.1: The complete Shor's algorithm including classical pre- and postprocessing.
The first branch is highly likely to fail, resulting in many repetitions of
the quantum heart of the algorithm, whereas the second branch is highly
likely to succeed.

first common step involves initializing the quantum computer to a single pure
state $|0\rangle_{2L}|0\rangle_L$. Note that for clarity the computer state has been broken
into a $2L$ qubit $k$-register and an $L$ qubit $f$-register. The meaning of this
will become clearer below.

Step two is to Hadamard transform each qubit in the $k$-register yielding

\begin{equation}
\frac{1}{2^L} \sum_{k=0}^{2^{2L}-1} |k\rangle_{2L}|0\rangle_L. \tag{7.1}
\end{equation}

Step three is to calculate and store the corresponding values of $f(k)$ in
the $f$-register

\begin{equation}
\frac{1}{2^L} \sum_{k=0}^{2^{2L}-1} |k\rangle_{2L}|f(k)\rangle_L. \tag{7.2}
\end{equation}

Note that this step requires additional ancilla qubits. The exact number
depends heavily on the circuit used.

Step four can actually be omitted but it explicitly shows the origin of
the period $r$ being sought. Measuring the $f$-register yields

\begin{equation}
\frac{\sqrt{r}}{2^L} \sum_{n=0}^{2^{2L}/r-1} |k_0 + nr\rangle_{2L}|f_M\rangle_L \tag{7.3}
\end{equation}

where $k_0$ is the smallest value of $k$ such that $f(k)$ equals the measured value
$f_M$.

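Step five is to apply the quantum Fourier transform

\begin{equation}
|k\rangle \rightarrow \frac{1}{2^L} \sum_{j=0}^{2^{2L}-1} \exp\left(\frac{2\pi i}{2^{2L}} jk\right) |j\rangle \tag{7.4}
\end{equation}

to the $k$-register resulting in

\begin{equation}
\frac{\sqrt{r}}{2^{2L}} \sum_{j=0}^{2^{2L}-1} \sum_{p=0}^{2^{2L}/r-1} \exp\left(\frac{2\pi i}{2^{2L}} j(k_0 + pr)\right) |j\rangle_{2L}|f_M\rangle_L. \tag{7.5}
\end{equation}

The meaning of Eq. (7.5) is best illustrated by reversing the order of the
summation

\begin{equation}
\frac{\sqrt{r}}{2^{2L}} \sum_{j=0}^{2^{2L}-1} \left( \sum_{p=0}^{2^{2L}/r-1} \exp\left(\frac{2\pi i}{2^{2L}} j(k_0 + pr)\right) \right) |j\rangle_{2L}|f_M\rangle_L. \tag{7.6}
\end{equation}

The probability of measuring a given value of $j$ is thus

\begin{equation}
\Pr(j, r, L) = \left| \frac{\sqrt{r}}{2^{2L}} \sum_{p=0}^{2^{2L}/r-1} \exp\left(\frac{2\pi i}{2^{2L}} jpr\right) \right|^2. \tag{7.7}
\end{equation}
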

If $r$ divides $2^{2L}$, Eq. (7.7) can be evaluated exactly. In this case the
probability of observing $j = c2^{2L}/r$ for some integer $0 \leq c < r$ is $1/r$
whereas if $j \neq c2^{2L}/r$ the probability is 0. This situation is illustrated
in Fig. 7.2a. However if $r$ divides $2^{2L}$ exactly a quantum computer is not
needed as $r$ would then be a power of 2 and easily calculable. When $r$ is
not a power of 2 the perfect peaks of Fig. 7.2a become slightly broader as
shown in Fig. 7.2b. All one can then say is that with high probability the
value $j$ measured will satisfy $j \simeq c2^{2L}/r$ for some $0 \leq c < r$.
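
The shape of the measurement distribution can be reproduced directly for small registers. The sketch below is an illustration rather than the method used to generate the figures: it builds the periodic state left after the $f$-register measurement, normalizes it numerically (so it also covers the case where $r$ does not divide the number of states), and applies a discrete Fourier transform. The function name and the `k0` parameter are assumptions of the example.

```python
import cmath
import math

def qpf_distribution(r, n_states, k0=0):
    """Probability of each outcome j for a state with period r."""
    ks = range(k0, n_states, r)              # populated basis states
    scale = 1.0 / math.sqrt(len(ks) * n_states)
    probs = []
    for j in range(n_states):
        amp = sum(cmath.exp(2j * math.pi * j * k / n_states) for k in ks)
        probs.append(abs(amp * scale) ** 2)
    return probs

exact = qpf_distribution(8, 256)             # r divides the state count
broad = qpf_distribution(10, 256)            # r does not divide it
print(exact[32], sum(exact))
print(max(range(1, 40), key=lambda j: broad[j]))
```

With 256 states and $r = 8$ the outcomes are confined to multiples of 32, each with probability 1/8; with $r = 10$ the peaks sit near multiples of 25.6 and have finite width.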

Given a measurement $j \simeq c2^{2L}/r$ with $c \neq 0$, classical postprocessing is
required to extract information about $r$. The process begins with a continued
fraction expansion. To illustrate, consider factoring 143 ($L = 8$). Suppose
we choose $m$ equal to 2 and the output $j$ of QPF is 31674. The relation
$j \simeq c2^{2L}/r$ becomes $31674 \simeq c\,65536/r$. The continued fraction expansion
of $c/r$ is

\begin{equation}
\frac{31674}{65536} = \frac{1}{\frac{32768}{15837}} = \frac{1}{2 + \frac{1094}{15837}} = \frac{1}{2 + \frac{1}{14 + \frac{1}{2 + \frac{1}{10 + \frac{1}{52}}}}}. \tag{7.8}
\end{equation}

The continued fraction expansion of any number between 0 and 1 is com-
pletely specified by the list of denominators which in this case is {2, 14, 2, 10, 52}.

[Fig. 7.2 plots: Pr(j) versus j, panels (a) and (b).]

Fig. 7.2: Probability of different measurements $j$ at the end of quantum period
finding with total number of states $2^{2L} = 256$ and (a) period $r = 8$, (b)
period $r = 10$.



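The $n$th convergent of a continued fraction expansion is the proper fraction
equivalent to the first $n$ elements of this list.

\begin{align}
\{2\} &= \frac{1}{2} \nonumber \\
\{2, 14\} &= \frac{14}{29} \nonumber \\
\{2, 14, 2\} &= \frac{29}{60} \tag{7.9} \\
\{2, 14, 2, 10\} &= \frac{304}{629} \nonumber \\
\{2, 14, 2, 10, 52\} &= \frac{15837}{32768} \nonumber
\end{align}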



An introductory exposition and further properties of continued fractions are 
described in Ref. ^j. The period r can be sought by substituting each 
denominator into the function $f(k) = 2^k \bmod 143$. With high probability
only the largest denominator less than $2^L$ will be of interest. In this case
$2^{60} \bmod 143 = 1$ hence $r = 60$.
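
The worked example can be reproduced with exact rational arithmetic. The following sketch uses Python's `fractions` module; the function names are illustrative, and the search simply tries each convergent denominator below $2^L$ in turn rather than only the largest.

```python
from fractions import Fraction

def cf_denominators(x):
    """Continued fraction terms of a rational number 0 < x < 1."""
    terms = []
    while x:
        x = 1 / x
        a = x.numerator // x.denominator   # integer part
        terms.append(a)
        x -= a
    return terms

def convergents(terms):
    """Fractions obtained from the first n terms, n = 1, 2, ..."""
    out = []
    for n in range(1, len(terms) + 1):
        value = Fraction(0)
        for a in reversed(terms[:n]):
            value = 1 / (a + value)
        out.append(value)
    return out

def period_from_measurement(j, m, n, length):
    """Try each convergent denominator below 2**length as the period."""
    terms = cf_denominators(Fraction(j, 2 ** (2 * length)))
    for frac in convergents(terms):
        d = frac.denominator
        if d < 2 ** length and pow(m, d, n) == 1:
            return d
    return None

print(cf_denominators(Fraction(31674, 65536)))
print(period_from_measurement(31674, 2, 143, 8))
```

For $j = 31674$ the terms are 2, 14, 2, 10, 52 and the denominators tried are 2, 29 and 60, the last of which satisfies $2^{60} \bmod 143 = 1$.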

Two modifications to the above are required. Firstly, if $c$ and $r$ have
common factors, none of the denominators will be the period but rather
one will be a divisor of $r$. After repeating QPF a number of times, let
$\{j_m\}$ denote the set of measured values. Let $\{c_{mn}/d_{mn}\}$ denote the set of
convergents associated with each measured value $j_m$. If a pair $c_{mn}$, $c_{m'n'}$
exists such that $\gcd(c_{mn}, c_{m'n'}) = 1$ and $d_{mn}$, $d_{m'n'}$ are divisors of $r$ then
$r = \mathrm{lcm}(d_{mn}, d_{m'n'})$, where lcm denotes the least common multiple. It can
be shown that given any two divisors $d_{mn}$, $d_{m'n'}$ with corresponding $c_{mn}$,
Cm'n' the probability that gcd{cmn, Cm'n') = 1 is at least 1/4 I^BIJI. Thus, on 
average, only a small number of different divisors are required. In practice,
it will not be known which denominators are divisors so every pair $d_{mn}$,
$d_{m'n'}$ with $\gcd(c_{mn}, c_{m'n'}) = 1$ must be tested.

The second modification is simply allowing for the possibility that the
output $j$ of QPF may be useless. Let $s$ denote the probability that
$j = \lfloor c2^{2L}/r \rfloor$ or $\lceil c2^{2L}/r \rceil$ for some $0 < c < r$, where
$\lfloor\,\rfloor$, $\lceil\,\rceil$ denote rounding down and up respectively. Such values of $j$
will be called useful as the denominators of the associated convergents are
guaranteed to include a divisor of $r$ pUj . To obtain a divisor of $r$, $O(1/s)$ runs
of QPF must be performed.

To summarize, as each new value $j_m$ is measured, the denominators
$d_{mn}$ less than $2^L$ of the convergents of the continued fraction expansion of
$j_m/2^{2L}$ are substituted into $f(k) = m^k \bmod N$ to determine whether any
$f(d_{mn}) = 1$, which would imply that $r = d_{mn}$. If not, every pair $d_{mn}$,
$d_{m'n'}$ with associated numerators $c_{mn}$, $c_{m'n'}$ satisfying $\gcd(c_{mn}, c_{m'n'}) = 1$
is tested to see whether $r = \mathrm{lcm}(d_{mn}, d_{m'n'})$. Note that as shown in Fig. 7.1,
if $r$ is odd or $m^{r/2} \equiv \pm 1 \pmod N$ then the entire process needs to
be repeated $O(1)$ times. Thus Shor's algorithm always succeeds provided
$O(1/s)$ runs of QPF can be performed. Note that if $s$ is too small, it may
not be possible to repeat QPF $O(1/s)$ times in a practical amount of time.
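
The classical post-processing summarized above can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the thesis's implementation: the names `convergent_pairs` and `find_period` are hypothetical, and the measured values passed in are hand-picked useful outputs for $m = 2$, $N = 143$, $L = 8$ (period 60).

```python
from math import gcd

def convergent_pairs(j, total):
    """Yield (numerator, denominator) of each convergent of j/total, 0 < j < total."""
    p_prev, q_prev, p, q = 1, 0, 0, 1
    num, den = j, total
    while num:
        a, rem = divmod(den, num)
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        den, num = num, rem
        yield p, q

def find_period(measured, L, m, N):
    """Return a candidate period from successive QPF outputs, or None."""
    total = 2 ** (2 * L)
    seen = []  # (c, d) pairs collected from earlier measurements
    for j in measured:
        current = [(c, d) for c, d in convergent_pairs(j, total) if d < 2 ** L]
        # First test each denominator directly.
        for _, d in current:
            if pow(m, d, N) == 1:
                return d
        # Otherwise test the lcm of pairs whose numerators are coprime.
        for c1, d1 in current:
            for c2, d2 in seen:
                if gcd(c1, c2) == 1:
                    r = d1 * d2 // gcd(d1, d2)
                    if r < 2 ** L and pow(m, r, N) == 1:
                        return r
        seen.extend(current)
    return None

# j = 5461 and j = 13107 are floor(c * 65536 / 60) for c = 5 and c = 12,
# whose convergents only reveal the divisors 12 and 5 of the period.
print(find_period([5461, 13107], L=8, m=2, N=143))
print(find_period([31674], L=8, m=2, N=143))
```

The sketch omits the final check on the parity of $r$ and on $m^{r/2} \bmod N$, which belongs to the surrounding classical algorithm.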



8. Shor's algorithm on an LNN QC 



Implementing Shor's factorization algorithm |Sl I184j is arguably the ulti- 
mate goal of much experimental quantum computer research. A necessary 
test of any quantum computer proposal is therefore whether or not it can 
implement quantum period finding (QPF) as described in Chapter 7. While
several different quantum circuits implementing QPF have been designed
(Table 7.1), most tacitly assume that arbitrary pairs of qubits within the
computer can be interacted. As discussed in Chapter El a large number 
of promising proposals, including the Kane quantum computer, are best 
suited to realizing a single line of qubits with nearest neighbor interactions 
only. Determining whether these linear nearest neighbor (LNN) architec- 
tures can implement QPF in particular and quantum algorithms in general 
in a practical manner is a nontrivial and important question. In this chapter 
we present a circuit implementing QPF designed for an LNN QC. As the 
rest of Shor's algorithm is classical, this implies that Shor's algorithm can 
be implemented on an LNN QC. Despite the interaction restrictions, the 
circuit presented uses just $2L + 4$ qubits and to leading order requires $8L^4$
gates arranged in a circuit of depth $32L^3$, identical to leading order to
the Beauregard circuit [191] upon which this work is based. Note that the
original Beauregard circuit used just $2L + 3$ qubits, but with the extra
qubit repeated Toffoli gates can be implemented more quickly, reducing the
overall depth of the circuit by a factor of 4. The precise differences between
the LNN and Beauregard circuits are detailed throughout the chapter.

As controlling large numbers of qubits has so far proven to be extraordinarily
difficult, this work places emphasis firstly on minimizing the required
number of qubits. Secondly, the depth has been minimized over the total
gate count in an effort to reduce the need for quantum error correction,
assuming the primary source of error will be decoherence rather than the
quantum gates themselves. If gate errors dominate, a higher depth but lower
gate count circuit would be preferable [186].

The chapter is structured as follows. In Section 8.1, Shor's algorithm is
broken into a series of simple tasks appropriate for direct translation into
circuits. Sections 8.2 to 8.6 then present, in order of increasing complexity,
the LNN quantum circuits that together comprise the LNN Shor quantum
circuit. The LNN quantum Fourier transform (QFT) is presented first,
followed by a modular addition, the controlled swap, modular multiplication,
and finally the complete circuit. Section 8.7 contains a summary of all
results, and a description of further work.

8.1 Decomposing Shor's algorithm 

The purpose of this section is to break Shor's algorithm into a series of steps 
that can be easily implemented as quantum circuits. Neglecting the classi- 
cal computations and optional measurement step described in the previous 
chapter, Shor's algorithm has already been broken into four steps. 

1. Hadamard transform. 

2. Modular exponentiation. 

3. Quantum Fourier transform. 

4. Measurement. 
The modular exponentiation step is the only one that requires further
decomposition.

The calculation of $f(k) = m^k \bmod N$ is firstly broken up into a series of
controlled modular multiplications.

\[
f(k) = \prod_{i=0}^{2L-1} \left( m^{2^i k_i} \bmod N \right), \tag{8.1}
\]

where $k_i$ denotes the $i$th bit of $k$. If $k_i = 1$ the multiplication by $m^{2^i} \bmod N$
occurs, and if $k_i = 0$ nothing happens.
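
As a purely classical illustration of Eq. (8.1), the sketch below builds $m^k \bmod N$ from one conditional multiplication per bit of $k$. The function name `modexp_by_bits` is an illustrative choice, and the values $m = 2$, $N = 143$ are simply the example used in the previous chapter.

```python
def modexp_by_bits(m, k, N, num_bits):
    """Compute m^k mod N as a product of controlled multiplications, one per bit of k."""
    result = 1
    for i in range(num_bits):
        k_i = (k >> i) & 1              # i-th bit of k
        if k_i:                         # multiply only when the control bit is 1
            result = (result * pow(m, 2 ** i, N)) % N
    return result

# With L = 8 the k register holds 2L = 16 bits.
print(modexp_by_bits(2, 60, 143, 16))
```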

There are many different ways to implement controlled modular multiplication
(Table 7.1). The methods of [191] require the fewest qubits and
will be used here. To illustrate how each controlled modular multiplication
proceeds, let $a(i) = m^{2^i} \bmod N$ and

\[
x(i) = \prod_{j=0}^{i-1} \left( m^{2^j k_j} \bmod N \right). \tag{8.2}
\]

$x(i)$ represents a partially completed modular exponentiation and $a(i)$ the
next term to multiply by. Let $|x(i), 0\rangle$ denote a quantum register containing
$x(i)$ and another of equal size containing 0. Firstly, add $a(i)$ modularly
multiplied by the first register to the second register if and only if (iff) $k_i = 1$.

\[
|x(i), 0\rangle \mapsto |x(i), 0 + a(i)x(i) \bmod N\rangle = |x(i), x(i+1)\rangle. \tag{8.3}
\]

Secondly, swap the registers iff $k_i = 1$.

\[
|x(i), x(i+1)\rangle \mapsto |x(i+1), x(i)\rangle. \tag{8.4}
\]
Thirdly, subtract $a(i)^{-1}$ modularly multiplied by the first register from the
second register iff $k_i = 1$.

\[
|x(i+1), x(i)\rangle \mapsto |x(i+1), x(i) - a(i)^{-1}x(i+1) \bmod N\rangle = |x(i+1), 0\rangle. \tag{8.5}
\]

Note that while nothing happens if $k_i = 0$, by the definition of $x(i)$ the final
state in this case will still be $|x(i+1), 0\rangle$.
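
The three register operations of Eqs. (8.3) to (8.5) can be traced classically. The sketch below is an illustration only (the function names are hypothetical); it tracks the values held by the two registers rather than simulating quantum states, and it assumes $\gcd(a, N) = 1$ so that the modular inverse exists. The call `pow(a, -1, N)` requires Python 3.8 or later.

```python
def controlled_multiply(x, a, N, k_i):
    """Map (x, 0) to (a*x mod N, 0) when k_i = 1, and leave (x, 0) alone otherwise."""
    first, second = x, 0
    if k_i:
        second = (second + a * first) % N        # Eq. (8.3): add a*x into the second register
        first, second = second, first            # Eq. (8.4): swap the registers
        a_inv = pow(a, -1, N)                    # modular inverse of a (Python 3.8+)
        second = (second - a_inv * first) % N    # Eq. (8.5): uncompute the old value
    return first, second

def modexp(m, k, N, num_bits):
    """Chain the controlled multiplications to obtain m^k mod N."""
    x = 1
    for i in range(num_bits):
        a = pow(m, 2 ** i, N)
        x, ancilla = controlled_multiply(x, a, N, (k >> i) & 1)
        assert ancilla == 0                      # the second register is always reset
    return x

print(controlled_multiply(5, 16, 143, 1), modexp(2, 37, 143, 16))
```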



The first and third steps described in the previous paragraph are further
broken up into series of controlled modular additions and subtractions
respectively.

\[
0 + a(i)x(i) = 0 + \sum_{j=0}^{L-1} a(i)2^j x(i)_j \bmod N, \tag{8.6}
\]

\[
x(i) - a(i)^{-1}x(i+1) = x(i) - \sum_{j=0}^{L-1} a(i)^{-1}2^j x(i+1)_j \bmod N, \tag{8.7}
\]

where $x(i)_j$ and $x(i+1)_j$ denote the $j$th bit of $x(i)$ and $x(i+1)$ respectively.
Note that the additions associated with a given $x(i)_j$ can only occur
if $x(i)_j = 1$ and similarly for the subtractions. Given that these additions
and subtractions form a multiplication that is conditional on $k_i$, it is also
necessary that $k_i = 1$.
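
Equation (8.6) states that a modular multiplication is a sequence of doubly conditioned modular additions. A small classical sketch of this, with the illustrative function name `multiply_by_additions` and example values chosen here rather than taken from the thesis:

```python
def multiply_by_additions(a, x, N, L, k_i=1):
    """Accumulate a*x mod N by adding (a * 2^j mod N) whenever bit j of x and k_i are both 1."""
    acc = 0
    for j in range(L):
        x_j = (x >> j) & 1
        if k_i and x_j:
            acc = (acc + (a * 2 ** j) % N) % N
    return acc

print(multiply_by_additions(16, 101, 143, 8))
```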



Further decomposition will be left for subsequent sections. 






Fig. 8.1: (a) Standard quantum Fourier transform circuit, (b) An equivalent linear 
nearest neighbor circuit. 

8.2 Quantum Fourier Transform 

The first circuit that needs to be described, as it will be used in all subsequent 
circuits, is the QFT. 

\[
|k\rangle \mapsto \frac{1}{\sqrt{2^L}} \sum_{j=0}^{2^L-1} \exp(2\pi i jk/2^L)\,|j\rangle \tag{8.8}
\]

Fig. 8.1a shows the usual circuit design for an architecture that can interact
arbitrary pairs of qubits. Fig. 8.1b shows the same circuit rearranged
with the aid of swap gates to allow it to be implemented on an LNN architecture.
Note that the general QFT circuit inverts the most significant to least
significant ordering of the qubits whereas the LNN circuit does not. Dashed
boxes indicate compound gates implemented with the aid of the canonical
decomposition. To emphasize the advantage of using compound gates,
Fig. 8.2 contains a comparison of a single swap gate with a Hadamard gate
followed by a controlled phase rotation, followed by a swap gate.

Fig. 8.2: (a) Swap gate expressed as a sequence of physical operations via the canon- 
ical decomposition, (b) Similarly decomposed compound gate consisting 
of a Hadamard gate, controlled phase rotation, and swap gate. Note that 
the Kane architecture has been used for illustrative purposes. 



Counting compound gates as one, the total number of gates required to
implement a QFT on $L$ qubits for both the general and LNN architectures
is $L(L-1)/2$. Assuming gates can be implemented in parallel, the minimum
circuit depth for both is $2L - 3$. Note that for large $L$ it is both necessary and
possible for nearly all of the exponentially small controlled rotation gates
to be omitted [192, 136]. Omitting these gates does not, however, enable a
reduction in the depth of the circuit. This point will be discussed in detail
in Chapter 9. Furthermore, in the LNN case the swap gates associated with
omitted controlled rotations must remain for the circuit to work so the gate
count also remains unchanged.
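
The gate count and depth quoted above can be reproduced with a simple schedule for the general (arbitrary-interaction) case. The sketch below is an assumption made for illustration, not a transcription of Fig. 8.1: it places the compound gate acting on qubits $i < j$ in time step $i + j$, which keeps the gates on any one qubit in order and never uses a qubit twice in the same step.

```python
def qft_schedule(L):
    """Group the L(L-1)/2 two-qubit gates of a QFT into parallel time steps."""
    steps = {}
    for i in range(L):
        for j in range(i + 1, L):
            steps.setdefault(i + j, []).append((i, j))
    return steps

L = 6
steps = qft_schedule(L)
gate_count = sum(len(gates) for gates in steps.values())
depth = len(steps)
print(gate_count, depth)   # compare with L(L-1)/2 and 2L - 3
```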

Fig. 8.3: (a) Quantum Fourier addition, (b) Controlled quantum Fourier addition 
and its symbolic equivalent circuit. Cont + a denotes the addition of a if 
Cont = 1. 

8.3 Modular Addition 

Given a quantum register containing an arbitrary superposition of binary 
numbers, there is a particularly easy way to add a binary number to each 
number in the superposition jl93l \Wf\ . By quantum Fourier transforming 
the superposition, the addition can be performed simply by applying appropriate
single-qubit rotations as shown in Fig. 8.3a. Such an addition can
also very easily be made dependent on a single control qubit as shown in
Fig. 8.3b.
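
The idea of adding in Fourier space can be checked with a small state-vector calculation. The following Python sketch is an illustration written for this purpose, not the circuit of Fig. 8.3: it uses a direct $O(4^n)$ discrete Fourier transform in place of a gate-level QFT, and the function names are hypothetical. Each loop over `q` applies one single-qubit phase rotation, whose angle depends only on the classical number `a`.

```python
import cmath

def qft(state, sign=+1):
    """Discrete Fourier transform of a state vector (sign=-1 gives the inverse)."""
    n = len(state)
    return [sum(amp * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                for k, amp in enumerate(state)) / n ** 0.5
            for j in range(n)]

def fourier_add(b, a, num_qubits):
    """Return (a + b) mod 2^num_qubits using phase rotations between a QFT and its inverse."""
    dim = 2 ** num_qubits
    state = [0j] * dim
    state[b] = 1 + 0j
    phi = qft(state)
    for q in range(num_qubits):
        angle = 2 * cmath.pi * a * 2 ** q / dim   # rotation applied to qubit q
        for j in range(dim):
            if (j >> q) & 1:
                phi[j] *= cmath.exp(1j * angle)
    out = qft(phi, sign=-1)
    return max(range(dim), key=lambda j: abs(out[j]))

print(fourier_add(5, 9, 4), fourier_add(11, 9, 4))
```

Because the rotations are diagonal, making the addition conditional on a control qubit only requires replacing each rotation by its controlled version.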

Performing controlled modular addition is considerably more complicated,
as shown in Fig. 8.4. This circuit adds $2^j m^{2^i} \bmod N$ to the register
containing $\phi(b)$ iff both $x(i)_j$ and $k_i$ are 1, to obtain $\phi(c)$ where
$c = (b + 2^j m^{2^i}) \bmod N$. Note that the register containing $\phi(b)$ is $L + 1$ qubits
in length to prevent overflow at any stage of the computation.

The first five gates comprise a Toffoli gate that sets $k_x = 1$ iff $x(i)_j =
k_i = 1$. $k_i$ and $x(i)_j$ are defined in Eq. (8.1) and Eqs. (8.6) and (8.7) respectively.
Note that the Beauregard circuit does not have a $k_x$ qubit, but without it
the singly-controlled Fourier additions become doubly-controlled and take
four times as long. The calculations of the gate count and circuit depth of
the Beauregard circuit presented here have therefore been done with a $k_x$
qubit included.

The next circuit element firstly adds $2^j m^{2^i} \bmod N$ iff $k_x = 1$, then
subtracts $N$. If $b + (2^j m^{2^i} \bmod N) < N$, subtracting $N$ will result in a negative
number. In a binary register, this means that the most significant bit will
be 1. The next circuit element is an inverse QFT which takes the addition
result out of Fourier space and allows the most significant bit to be accessed
by the following CNOT. The MS (Most Significant) qubit will now be 1 iff
the addition result was negative. If $b + (2^j m^{2^i} \bmod N) \geq N$, subtracting
$N$ will yield the non-negative number $(b + 2^j m^{2^i}) \bmod N$ and the MS qubit will
remain set to 0.

We now encounter the first circuit element that would not be present if
interactions between arbitrary pairs of qubits were possible. Note that while
this "long swap" operation technically consists of $L$ regular swap gates, it
only increases the depth of the circuit by 1. The subsequent QFT enables
the MS controlled Fourier addition of $N$, yielding the positive number
$(b + 2^j m^{2^i}) \bmod N$ if MS = 1 and leaving the already correct result unchanged
if MS = 0.

While it might appear that we are now done, the qubits MS and $k_x$
must be reset so they can be reused. The next circuit element subtracts
$2^j m^{2^i} \bmod N$. The result will be positive, and hence the most significant bit
of the result equal to 0, iff the very first addition $b + (2^j m^{2^i} \bmod N)$ gave a
number less than $N$. This corresponds to the MS = 1 case. After another
inverse QFT to allow the most significant bit of the result to be accessed, the
MS qubit is reset by a CNOT gate that flips the target qubit iff the control
qubit is 0. Note that the long swap operation that occurs in the middle of
all this to move the $k_x$ qubit to a more convenient location only increases
the depth of the circuit by 1.

Fig. 8.4: Circuit to compute $c = (b + 2^j m^{2^i}) \bmod N$. The diagonal circuit elements
labelled swap represent a series of 2-qubit swap gates. Small gates spaced
close together represent compound gates. The qubits $x(i)$ are defined in
Eq. (8.2) and essentially store the current partially calculated value of the
modular exponentiation that forms the heart of Shor's algorithm. The MS
(Most Significant) qubit is used to keep track of the sign of the partially
calculated modular addition result. The $k_i$ qubit is the $i$th bit of $k$ in
Eq. (8.1). The $k_x$ qubit is set to 1 if and only if $x(i)_j = k_i = 1$.
$k_x \pm 2^j m^{2^i} \bmod N$ denotes modular addition (subtraction) conditional on
$k_x = 1$.

After adding back $2^j m^{2^i} \bmod N$, the next few gates form a Toffoli gate
that resets kx. The final two swap gates move into position ready 

for the next modular addition. Note that the L and R gates are inverses of 
one another and hence not required if modular additions precede and follow 
the circuit shown. Only one of the final two swap gates contributes to the 
overall depth of the circuit. 
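
The sequence of additions, subtractions and sign checks described for the modular addition circuit can be followed classically. The sketch below is an illustrative trace under stated assumptions, not a simulation of Fig. 8.4: it models the $L + 1$ qubit register as an integer modulo $2^{L+1}$ (two's complement), omits the QFTs and swaps, and uses the hypothetical name `controlled_modular_add`. Inputs are assumed to satisfy $0 \le b, a < N < 2^L$.

```python
def controlled_modular_add(b, a, N, L, k_x):
    """Return ((b + a) mod N if k_x else b, final MS bit) following the circuit's steps."""
    size = 2 ** (L + 1)          # register of L + 1 bits, two's complement
    ms = 0
    if k_x:
        b = (b + a) % size       # add a, conditional on k_x
    b = (b - N) % size           # subtract N
    ms ^= b >> L                 # CNOT from the sign bit onto MS
    if ms:
        b = (b + N) % size       # add N back if the result was negative
    if k_x:
        b = (b - a) % size       # subtract a to expose the sign needed to reset MS
    ms ^= 1 - (b >> L)           # CNOT that flips MS iff the sign bit is 0
    if k_x:
        b = (b + a) % size       # add a back
    return b, ms

print(controlled_modular_add(100, 80, 143, 8, 1))
print(controlled_modular_add(100, 80, 143, 8, 0))
```

In every case the MS value returns to 0, which is what allows the qubit to be reused by the next modular addition.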

The total gate count of the LNN modular addition circuit is $2L^2 + 8L + 22$
and compares very favorably with the general architecture gate count of
$2L^2 + 6L + 14$. Similarly, the LNN depth is $8L + 16$ versus the general depth
of $8L + 13$.

8.4 Controlled swap 

Performing a controlled swap of two large registers is slightly more difficult 
when only LNN interactions are available. The two registers need to be 
meshed so that pairs of equally significant qubits can be controlled-swapped. 
The mesh circuit is shown in Fig. 8.5. This circuit element would not be
required in a general architecture.

After the mesh circuit has been applied, the functional part of the controlled
swap circuit (Fig. 8.6) can be applied optimally with the control
qubit moving from one end of the meshed registers to the other. The mesh
circuit is then applied in reverse to untangle the two registers.

The gate count and circuit depth of a mesh circuit are $L(L-1)/2$ and $L-1$
respectively. The corresponding equations for a complete LNN controlled
swap are $L^2 + 5L$ and $6L$. The general controlled swap only requires $6L$ gates
and can be implemented in a circuit of depth $4L + 2$. The controlled swap is
the only part of this implementation of Shor's algorithm that is significantly
more difficult to implement on an LNN architecture.

Fig. 8.5: Circuit designed to interleave two quantum registers.

Fig. 8.6: (a) LNN circuit for the controlled swapping of two qubits $|a\rangle$ and $|b\rangle$. The
qubits $|a'\rangle$ and $|b'\rangle$ represent the potentially swapped states. (b) LNN
circuit for the controlled swapping of two quantum registers. Note that
when chained together, the effective depth of the cswap gate is 4.
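
The mesh gate count and depth can be checked with a small sketch that interleaves two registers using only nearest neighbor swaps. The triangular swap pattern used here is an assumption chosen because it reproduces the stated figures of $L(L-1)/2$ swaps in depth $L - 1$; it is not copied from Fig. 8.5, and the function name `mesh` is illustrative.

```python
def mesh(register_a, register_b):
    """Interleave two equal-length registers laid out on a line using adjacent swaps."""
    L = len(register_a)
    line = list(register_a) + list(register_b)
    swaps = 0
    depth = 0
    for t in range(1, L):                 # one parallel layer per time step
        for s in range(t):
            p = L - t + 2 * s             # swap the neighbors at positions p and p + 1
            line[p], line[p + 1] = line[p + 1], line[p]
            swaps += 1
        depth += 1
    return line, swaps, depth

line, swaps, depth = mesh(["a0", "a1", "a2", "a3"], ["b0", "b1", "b2", "b3"])
print(line, swaps, depth)
```

All swaps within one layer act on disjoint pairs, so each layer contributes a depth of 1.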

8.5 Modular Multiplication 

The ideas behind the modular multiplication circuit of Fig. 8.7 were discussed
in Section 8.1. The first third comprises a controlled modular multiply
(via repeated addition) with the result being stored in a temporary
register. The middle third implements a controlled swap of registers. The 
final third resets the temporary register. 

Note that the main way in which the performance of the LNN circuit 
differs from the ideal general case is due to the inclusion of the two mesh 
circuits. Nearly all of the remaining swaps shown in the circuit do not 
contribute to the overall depth. Note that the two swaps drawn within the 
QFT and inverse QFT are intended to indicate the appending of a swap 
gate to the first and last compound gates in these circuits respectively. 

The total gate count for the LNN modular multiplication circuit is
$4L^3 + 20L^2 + 58L - 2$ versus the general gate count of $4L^3 + 13L^2 + 35L + 4$. The
LNN depth is $16L^2 + 40L - 7$ and the general depth $16L^2 + 33L - 6$.

8.6 Complete Circuit 

The complete circuit for Shor's algorithm (Fig. 8.8) can best be understood
with reference to Fig. 8.1 and the four steps described in Section 8.1. The
last two steps of Shor's algorithm are a QFT and measurement of the qubits
involved in the QFT. When a 2-qubit controlled quantum gate is followed by
measurement of the controlled qubit, it is equivalent to measure the control
qubit first and then apply a classically controlled gate to the target qubit.
If this is done to every qubit in Fig. 8.1, it can be seen that every qubit
is decoupled. Furthermore, since the QFT is applied to the $k$ register and
the $k$ register qubits are never interacted with one another, it is possible to
arrange the circuit such that each qubit in the $k$ register is sequentially used
to control a modular multiplication, QFTed, then measured. Even better,
after the first qubit of the $k$ register is manipulated in this manner, it can be
reset and used as the second qubit of the $k$ register. This one qubit trick
[194] forms the basis of Fig. 8.8.

Fig. 8.7: Circuit designed to modularly multiply $x(i)$ by $m^{2^i}$ if and only if $k_i = 1$.
Note that for simplicity the circuit for $L = 4$ has been shown. Note that
the bottom $L + 1$ qubits are ancilla and as such start and end in the $|\phi(0)\rangle$
state. The swap gates within the two QFT structures represent compound
gates. $k_i \times m^{2^i} \bmod N$ denotes modular multiplication conditional on
$k_i = 1$.

The total number of gates required in the LNN and general cases are
$8L^4 + 40L^3 + 116\tfrac{1}{2}L^2 + 4\tfrac{1}{2}L - 2$ and $8L^4 + 26L^3 + 70\tfrac{1}{2}L^2 + 8\tfrac{1}{2}L - 1$ respectively.
The circuit depths are $32L^3 + 80L^2 - 4L - 2$ and $32L^3 + 66L^2 - 2L - 1$
respectively. The primary result of this chapter is that the gate count and
depth equations for both architectures are identical to leading order.
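
To see how quickly the lower order terms become negligible, the polynomials above can simply be evaluated. The following Python sketch uses exact rational arithmetic for the half-integer coefficients; the function names are illustrative.

```python
from fractions import Fraction as F

def lnn_gates(L):
    return 8 * L**4 + 40 * L**3 + F(233, 2) * L**2 + F(9, 2) * L - 2

def general_gates(L):
    return 8 * L**4 + 26 * L**3 + F(141, 2) * L**2 + F(17, 2) * L - 1

def lnn_depth(L):
    return 32 * L**3 + 80 * L**2 - 4 * L - 2

def general_depth(L):
    return 32 * L**3 + 66 * L**2 - 2 * L - 1

for L in (8, 128, 4096):
    gate_ratio = float(lnn_gates(L) / general_gates(L))
    depth_ratio = lnn_depth(L) / general_depth(L)
    print(L, round(gate_ratio, 5), round(depth_ratio, 5))
```

Both ratios approach 1 as $L$ grows, which is the sense in which the two architectures agree to leading order.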

8.7 Conclusion

We have presented a circuit implementing Shor's algorithm in a manner
appropriate for a linear nearest neighbor qubit array and studied the number
of extra gates and consequent increase in circuit depth such a design entails.
To leading order our circuit involves $8L^4$ gates arranged in a circuit of depth
$32L^3$ on $2L + 4$ qubits, figures identical to those possible when interactions
between arbitrary pairs of qubits are allowed. Given the importance of
Shor's algorithm, this result supports the widespread experimental study of
linear nearest neighbor architectures.

Simulations of the robustness of the circuit when subjected to random
discrete errors have been completed [195], showing extreme sensitivity to
even small numbers of errors. Future simulations will investigate the
performance of the circuit when protected by LNN quantum error correction.

Fig. 8.8: Circuit implementing the quantum part of Shor's algorithm. The
single-qubit gates interleaved between the modular multiplications comprise a
QFT that has been decomposed by using measurement gates to remove
the need for controlled quantum phase rotations. Note that without these
single-qubit gates the remaining circuit is simply modular exponentiation.



9. Shor's algorithm with a limited set of rotation gates



Every circuit implementation of Shor's algorithm (see Table 7.1) ideally
calls for controlled rotation gates of magnitude $\pi/2^{2L}$ where $L$ is the binary
length of the integer to be factored. Such exponentially small rotations are
physically impossible to implement for large $L$. Prior work by Coppersmith
focusing solely on a quantum Fourier transform suggested that it would be
sufficient to implement controlled $\pi/10^6$ rotations if integers thousands of
bits long were to be factored [192]. In this chapter, we study in detail
the complete Shor's algorithm using only controlled $\pi/2^d$ rotation gates
with $d$ less than or equal to some $d_{\max}$. It is found that integers up to
length $L_{\max} = O(4^{d_{\max}})$ can be factored without significant performance
penalty. Consequently, we are able to show that controlled rotation gates of
magnitude $\pi/64$ are sufficient to factor integers thousands of bits long.

The reader is assumed to be familiar with the description of Shor's algorithm
and notation as outlined in Chapter 7. In Section 9.1, Coppersmith's
approximate quantum Fourier transform is introduced. In Section 9.2 we
investigate the relationship between the period $r$ of the function $f(k)$ given
as input to the quantum part of Shor's algorithm, and the probability $s$ of
obtaining useful output. In Section 9.3 we study the relationship between
$s$ and both the length $L$ of the integer being factored and the minimum
angle controlled rotation $\pi/2^{d_{\max}}$. This is then used to relate $L_{\max}$ to $d_{\max}$.
Section 9.4 contains a summary of results.

Fig. 9.1: Circuit for a 4-qubit (a) quantum Fourier transform and (b) approximate
quantum Fourier transform with $d_{\max} = 1$.



9.1 Approximate quantum Fourier transform 



Provided any circuit from Table 7.1 other than Beauregard's is used to
implement Shor's algorithm, exponentially small rotations only occur in the
one and only quantum Fourier transform required just before measurement.
The standard QFT circuit is shown in Fig. 9.1a. Note the use of controlled
rotations of magnitude $\pi/2^d$. In matrix notation these 2-qubit operations
correspond to

\[
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & e^{i\pi/2^d}
\end{pmatrix}. \tag{9.1}
\]

Coppersmith's approximate QFT (AQFT) circuit [192] is very similar,
with just the deletion of rotation gates with $d$ greater than some $d_{\max}$. For
example, Fig. 9.1b shows an AQFT with $d_{\max} = 1$. Let $[j]_m$ denote the $m$th
bit of $j$. The action of the AQFT on a computational basis state $|k\rangle$ is

\[
|k\rangle \mapsto \frac{1}{\sqrt{2^{2L}}} \sum_{j=0}^{2^{2L}-1} |j\rangle
\exp\!\left( \frac{2\pi i}{2^{2L}} \tilde{\sum}_{mn} [j]_m [k]_n 2^{m+n} \right) \tag{9.2}
\]

where $\tilde{\sum}_{mn}$ denotes a sum over all $m$, $n$ such that $0 \leq m, n < 2L$ and
$2L - d_{\max} + 1 \leq m + n < 2L$. It has been shown by Coppersmith that
the AQFT is a good approximation of the QFT [192] in the sense that the
phase of individual computational basis states in the output of the AQFT
differ in angle from those in the output of the QFT by at most $2\pi L/2^{d_{\max}}$.
The purpose of this chapter is to investigate in detail the effect of using the 
AQFT in Shor's algorithm. 

9.2 Dependence of output reliability on period of $f(k) = m^k \bmod N$

Different values of $r$ (the period of $f(k) = m^k \bmod N$) imply different
probabilities $s$ that the value $j$ measured at the end of QPF will be useful. In
particular, as discussed in Chapter 7, if $r$ is a power of 2 the probability of
useful output is much higher (Fig. 7.2). This section investigates how sensitive
$s$ is to variations in $r$. Recall Eq. (7.7) for the probability of measuring
a given value of $j$. When the AQFT of Eq. (9.2) is used this becomes



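\[
\Pr(j, r, L, d_{\max}) = \left| \frac{\sqrt{r}}{2^{2L}} \sum_{p=0}^{2^{2L}/r - 1}
\exp\!\left( \frac{2\pi i}{2^{2L}} \tilde{\sum}_{mn} [j]_m [pr]_n 2^{m+n} \right) \right|^2. \tag{9.3}
\]

The probability $s$ of useful output is thus

\[
s(r, L, d_{\max}) = \sum_{\{\text{useful } j\}} \Pr(j, r, L, d_{\max}) \tag{9.4}
\]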
where {useful $j$} denotes all $j = \lfloor c2^{2L}/r \rfloor$ or $\lceil c2^{2L}/r \rceil$ such that $0 < c < r$.
Fig. 9.3 shows $s$ for $r$ ranging from 2 to $2^L - 1$ and for various values of $L$ and
$d_{\max}$. The decrease in $s$ for small values of $r$ is more a result of the definition
of {useful $j$} than an indication of poor data. When $r$ is small there are
few useful values of $j \simeq c2^{2L}/r$, $0 < c < r$, and a large range of states likely
to be observed around each one, resulting superficially in a low probability
of useful output $s$, as $s$ is the sum of the probabilities of observing only
values $j = \lfloor c2^{2L}/r \rfloor$ or $\lceil c2^{2L}/r \rceil$, $0 < c < r$. However, in practice values
much further from $j \simeq c2^{2L}/r$ can be used to obtain useful output. For
example, if $r = 4$ and $j = 16400$ the correct output value (4) can still be
determined from the continued fraction expansion of 16400/65536, which is
far from the ideal case of 16384/65536. To simplify subsequent analysis each
pair $(L, d_{\max})$ will from now on be associated with $s(2^{L-1} + 2, L, d_{\max})$, which
corresponds to the minimum value of $s$ to the right of the central peak. The
choice of this point as a meaningful characterization of the entire graph is
justified by the discussion above.
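
The claim that $j = 16400$ still reveals $r = 4$ is easy to verify. A minimal Python sketch, with the illustrative function name `convergent_denominators`, lists the denominators of the convergents of $j/65536$:

```python
def convergent_denominators(num, den):
    """Denominators of the successive convergents of num/den, 0 < num < den."""
    q_prev, q = 0, 1
    out = []
    while num:
        a, rem = divmod(den, num)
        q, q_prev = a * q + q_prev, q
        out.append(q)
        den, num = num, rem
    return out

print(convergent_denominators(16400, 65536))   # non-ideal measurement
print(convergent_denominators(16384, 65536))   # ideal measurement
```

The denominator 4 appears in both lists, so the period is recoverable even though 16400 is not one of the values counted as useful in the definition of $s$.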



For completeness, Fig. 9.3e shows the case of noisy controlled rotation
gates of the form

\[
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & e^{i(\pi/2^d + \delta)}
\end{pmatrix} \tag{9.5}
\]

where $\delta$ is a normally distributed random variable of standard deviation $\sigma$.
This has been included to simulate the effect of using approximate rotation
gates built out of a finite number of fault-tolerant gates. The general form
and probability of successful output can be seen to be similar despite
$\sigma = \pi/32$. This $\sigma$ corresponds to $\pi/2^{d_{\max}+2}$. For a controlled $\pi/64$ rotation,
single-qubit rotations of angle $\pi/128$ are required, as shown in Fig. 9.2.
Fig. 9.3e implies that it is acceptable for these rotations to be implemented
within $\pi/512$, implying

\[
U = \begin{pmatrix}
1 & 0 \\
0 & e^{i(\pi/128 + \pi/512)}
\end{pmatrix} \tag{9.6}
\]

is an acceptable approximation of $R_{128}$. This point will be developed further
in Chapter 10.

Fig. 9.2: Decomposition of a controlled phase gate into single-qubit rotations and
a CNOT gate.



9.3 Dependence of output usefulness on integer length and rotation gate set

In order to determine how the probability of useful output $s$ depends on both
the integer length $L$ and the minimum allowed controlled rotation $\pi/2^{d_{\max}}$,
Eq. (9.4) was solved with $r = 2^{L-1} + 2$ as discussed in Section 9.2. Fig. 9.4
contains semilog plots of $s$ versus $L$ for different values of $d_{\max}$. Note that
Eq. (9.4) grows exponentially more difficult to evaluate as $L$ increases.

For $d_{\max}$ from 0 to 5, the exponential decrease of $s$ with increasing $L$ is
clear. Asymptotic lines of best fit of the form

\[
s \propto 2^{-L/L_0} \tag{9.7}
\]

have been shown. Note that for $d_{\max} > 0$, the value of $L_0$ increases by
greater than a factor of 4 when $d_{\max}$ increases by 1. This enables one to
generalize Eq. (9.7) to an asymptotic lower bound valid for all $d_{\max} > 0$

\[
s \propto 2^{-L/4^{d_{\max}-1}} \tag{9.8}
\]

with the constant of proportionality approximately equal to 1.

Fig. 9.3: Probability $s$ of obtaining useful output from quantum period finding as
a function of period $r$ for different integer lengths $L$ and rotation gate
restrictions $\pi/2^{d_{\max}}$. The effect of using inaccurate controlled rotation
gates ($\sigma = \pi/32$) is shown in (e).

Fig. 9.4: Dependence of the probability of useful output from the quantum part of
Shor's algorithm on the length $L$ of the integer being factored for different
levels of restriction of controlled rotation gates of angle $\pi/2^{d_{\max}}$. The
parameter $L_0$ characterizes lines of best fit of the form $s \propto 2^{-L/L_0}$.

Keeping in mind that the required number of repetitions of QPF is
$O(1/s)$, one can relate $L_{\max}$ to $d_{\max}$ by introducing an additional parameter
$f_{\max}$ characterizing the acceptable number of repetitions of QPF

\[
L_{\max} \sim 4^{d_{\max}-1} \log_2 f_{\max}. \tag{9.9}
\]

Available RSA encryption programs such as PGP typically use integers
of length $L$ up to 4096. The circuit in [186] runs in $150L^3$ steps when an
architecture that can interact arbitrary pairs of qubits in parallel is assumed
and fault-tolerant gates are used. By virtue of the fact that this circuit only
interacts a few qubits at a time leaving the rest idle, error correction can
be easily built in without increasing the circuit depth. Thus for $L = 4096$,
${\sim}10^{13}$ steps are required to perform a single run of QPF. On an electron spin
or charge quantum computer |1961 199j running at 10 GHz this corresponds
to ~15 minutes of computing. If we assume ~24 hours of computing is
acceptable then $f_{\max} \sim 10^2$. Substituting these values of $L_{\max}$ and $f_{\max}$ into
Eq. (9.9) gives $d_{\max} = 6$ after rounding up. Thus provided controlled $\pi/64$
rotations can be implemented accurately, implying the need to accurately
implement $\pi/128$ single-qubit rotations, it is conceivable that a quantum
computer could one day be used to break a 4096-bit RSA encryption in a
single day. With additional qubits, this time could be reduced by several
orders of magnitude by using one of the circuits described in Ref. [185].
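
The arithmetic behind this estimate can be reproduced in a few lines. The Python sketch below is an illustration using the figures quoted in the text ($150L^3$ steps, a 10 GHz clock, roughly $10^2$ acceptable repetitions); the function names and default arguments are hypothetical, and the relation between $L_{\max}$, $d_{\max}$ and $f_{\max}$ is treated as an equality purely for the purpose of inverting it.

```python
import math

def qpf_runtime_seconds(L, clock_hz=10e9, steps_per_L_cubed=150):
    """Time for one run of QPF assuming steps_per_L_cubed * L^3 sequential steps."""
    return steps_per_L_cubed * L**3 / clock_hz

def min_dmax(L_max, f_max):
    """Smallest integer d_max with 4^(d_max - 1) * log2(f_max) >= L_max."""
    return math.ceil(1 + math.log(L_max / math.log2(f_max), 4))

runtime = qpf_runtime_seconds(4096)
print(round(150 * 4096**3 / 1e13, 2), "x 10^13 steps")
print(round(runtime / 60, 1), "minutes per run")
print(min_dmax(4096, 100))
```

A single run comes out at roughly 17 minutes, consistent with the order-of-magnitude figure in the text, and the rounded-up value of $d_{\max}$ is insensitive to whether $f_{\max}$ is taken as 100 or as the exact ratio of 24 hours to the run time.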

9.4 Conclusion 

We have demonstrated the robustness of Shor's algorithm when a limited set
of rotation gates is used. The length $L_{\max}$ of the longest factorable integer
can be related to the maximum acceptable runs of quantum period finding
$f_{\max}$ and the smallest accurately implementable controlled rotation gate
$\pi/2^{d_{\max}}$ via $L_{\max} \sim 4^{d_{\max}-1} \log_2 f_{\max}$. Integers thousands of digits in length
can be factored provided controlled $\pi/64$ rotations can be implemented with
rotation angle accurate to $\pi/256$, corresponding to single-qubit $\pi/128$
rotations implemented within $\pi/512$. Sufficiently accurate fault-tolerant
approximations of such single-qubit rotation gates are presented in Chapter 10.






10. Constructing arbitrary single-qubit 
fault-tolerant gates 



In large-scale quantum computation, every qubit of data is encoded across 
multiple physical qubits to form a logical qubit permitting quantum error 
correction and fault-tolerant computation. Unfortunately, only very small 
sets of fault-tolerant gates $\mathcal{G}$ can be applied simply and exactly to logical
qubits, where $\mathcal{G}$ depends on the number of logical qubits considered, the
code used, and the level of complexity one is prepared to tolerate when
implementing fault-tolerant gates. Gates outside $\mathcal{G}$ must be approximated with
sequences of gates in $\mathcal{G}$. The existence of efficient approximating sequences
has been established by the Solovay-Kitaev theorem and subsequent work
|119l 112(11 13U1 [T^ . In this chapter, we describe a numerical procedure taking
a universal gate set $\mathcal{G}$, gate $U$, and integer $l$ and outputting an optimal
approximation of $U$ using at most $l$ gates from $\mathcal{G}$. This procedure is used to
explore the properties of approximations of the single-qubit phase rotation
gates built out of fault-tolerant gates that can be applied to a single Steane
code logical qubit. The average rate of convergence of Steane code
fault-tolerant approximations to arbitrary single-qubit gates is also obtained.

Section 10.1 describes the basics of the numerical procedure used to find
optimal gate sequences approximating a given gate. A universal set of 24
gates that can be applied fault-tolerantly to a single Steane code logical qubit
is given in Section 10.2 along with most of their quantum circuits. The
complicated circuits comprising the T-gate, which is part of this universal set,
are described separately in Section 10.3. Section 10.4 contains a discussion
of single-qubit phase rotations and their fault-tolerant approximations,
followed by approximations of arbitrary gates in Section 10.5. Section 10.6
summarizes the results of this chapter and their implications, and points to
further work.



10.1 Finding optimal approximations

Let $U(m)$ denote the $m$-dimensional unitary group. In this section, we
outline a numerical procedure that takes a finite gate set
$\mathcal{G} \subset U(m)$ that generates $U(m)$, a gate $U \in U(m)$, and an
integer $l$, and outputs an optimal sequence $U_l$ of at most $l$ gates from
$\mathcal{G}$ minimizing the metric

$$\mathrm{dist}(U, U_l) = \sqrt{\frac{m - |\mathrm{tr}(U^\dagger U_l)|}{m}}. \qquad (10.1)$$

The rationale of Eq. (10.1) is that if $U$ and $U_l$ are similar,
$U^\dagger U_l$ will be close to the identity matrix (possibly up to some
global phase) and the absolute value of the trace will be close to $m$. By
subtracting this absolute value from $m$ and dividing by $m$, a number between
0 and 1 is obtained. The overall square root is required to ensure that the
triangle inequality

$$\mathrm{dist}(U, W) \le \mathrm{dist}(U, V) + \mathrm{dist}(V, W) \qquad (10.2)$$

is satisfied. This metric has been used in preference to the trace distance
used in the Solovay-Kitaev theorem jl2()| LSOj, as the trace distance does
not ignore global phase, and hence leads to unnecessarily long phase-correct
approximating sequences.

Finding optimal gate sequences is a difficult task, and the run-time of
the numerical procedure presented here scales exponentially with $l$.
Nevertheless, as we shall see in Section 10.4, gate sequences of sufficient
length for practical purposes can be obtained.
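As an illustration, the metric of Eq. (10.1) can be evaluated in a few lines. The following is a minimal sketch, assuming gates are stored as nested Python lists of numbers; the helper names `dagger`, `matmul` and `dist` are illustrative choices and are not taken from the original work.

```python
import math

def dagger(a):
    """Conjugate transpose of a square matrix stored as a list of rows."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dist(u, v):
    """Eq. (10.1): sqrt((m - |tr(u^dagger v)|) / m)."""
    m = len(u)
    prod = matmul(dagger(u), v)
    trace = sum(prod[i][i] for i in range(m))
    # Clamp at zero so rounding error cannot produce a negative radicand.
    return math.sqrt(max(m - abs(trace), 0.0) / m)

IDENTITY = [[1, 0], [0, 1]]
GLOBAL_PHASE = [[1j, 0], [0, 1j]]  # i * I, differs from I only by a global phase
PAULI_Z = [[1, 0], [0, -1]]

print(dist(IDENTITY, GLOBAL_PHASE))  # global phase is ignored by the metric
print(dist(IDENTITY, PAULI_Z))       # a traceless product gives the largest value
```

The two printed values show the intended behaviour: a pure global phase has zero distance from the identity, while Z, whose product with the identity is traceless, sits at the maximum distance of 1.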

For a set $\mathcal{G}$ of size $g = |\mathcal{G}|$ and a maximum sequence
length of $l$, the size of the set of all possible gate sequences of length up
to $l$ is approximately $g^l$. For even moderate $g$ and $l$, this set cannot
be searched exhaustively. To describe the basics of the actual method used, a
few more definitions are required. Let $G$ denote a gate in $\mathcal{G}$.
Order $\mathcal{G}$, and denote the $i$th gate by $G_i$. Let $S$ denote a
sequence of gates in $\mathcal{G}$. Order the possible gate sequences in the
obvious manner $G_1, \ldots, G_g, G_1G_1, \ldots, G_1G_g, G_2G_1, \ldots$, and
let $S_n$ denote the $n$th sequence in this ordering. Let $\{S\}_l$ denote all
sequences with length less than or equal to $l$. Let $\{Q\}_{l'}$, $l' < l$,
denote the set of unique sequences of length at most $l'$. Naively,
$\{Q\}_{l'}$ can be constructed by starting with the set containing the
identity matrix, sequentially testing whether $S_n \in \{S\}_{l'}$ satisfies
$\mathrm{dist}(S_n, Q) > 0$ for all $Q \in \{Q\}_{l'}$, and adding $S_n$ to
$\{Q\}_{l'}$ if it does. A search for an optimal approximation of $U$ using
gates in $\mathcal{G}$ begins with the construction of a very large set of
unique sequences $\{Q\}_{l'}$.
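The naive construction of the set of unique sequences described above can be sketched as follows. This is only an illustration under stated assumptions: the two-gate set {H, T} is chosen here to keep the example small, a numerical tolerance `tol` stands in for the exact test dist > 0, and the function names are hypothetical.

```python
import cmath
import itertools
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
R = 1 / math.sqrt(2)
H = [[R, R], [R, -R]]
T = [[1.0, 0.0], [0.0, cmath.exp(1j * math.pi / 4)]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dist(u, v):
    """Eq. (10.1) for 2x2 matrices: sqrt((2 - |tr(u^dagger v)|) / 2)."""
    trace = sum(u[k][i].conjugate() * v[k][i] for i in range(2) for k in range(2))
    return math.sqrt(max(2 - abs(trace), 0.0) / 2)

def sequence_matrix(gates, seq):
    """Multiply out the gates named by the index tuple seq, left to right."""
    m = IDENTITY
    for idx in seq:
        m = matmul(m, gates[idx])
    return m

def unique_sequences(gates, max_len, tol=1e-6):
    """Keep a sequence only if it is not equivalent (dist below tol, i.e.
    equal up to global phase) to any sequence kept earlier."""
    kept = [((), IDENTITY)]
    for length in range(1, max_len + 1):
        # product() varies the rightmost index fastest, matching the ordering
        # G_1, ..., G_g, G_1G_1, ..., G_1G_g, G_2G_1, ...
        for seq in itertools.product(range(len(gates)), repeat=length):
            m = sequence_matrix(gates, seq)
            if all(dist(m, q) > tol for _, q in kept):
                kept.append((seq, m))
    return kept

unique = unique_sequences([H, T], max_len=2)
print([seq for seq, _ in unique])
```

With indices 0 for H and 1 for T, the sequence HH is dropped because it equals the identity, while HT, TH and TT survive.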

The utility of $\{Q\}_{l'}$ lies in its ability to predict which sequences in
$\{S\}_l$, $l > l'$, do not need to be compared with $U$ to determine whether
they are good approximations, and what the next sequence worth comparing is.
To be more precise, assume every sequence up to $S_{n-1}$ has been compared
with $U$. Let $\{S_{n-1}\}$ denote this set of compared sequences. Consider
subsequences of $S_n$ of length $l'$. If any subsequence is not in
$\{Q\}_{l'}$, there exists a sequence in $\{S_{n-1}\}$ equivalent to $S_n$. In
other words, a sequence equivalent to $S_n$ has already been compared with
$U$, and $S_n$ can be skipped. Furthermore, let

$$S_n = G_{i_1} \ldots G_{i_{j-1}} G_{i_j} \ldots G_{i_{j+l'-1}} G_{i_{j+l'}} \ldots G_{i_k}, \qquad (10.3)$$

where $k$ is the length of $S_n$ and $G_{i_j} \ldots G_{i_{j+l'-1}}$ is the
subsequence not in $\{Q\}_{l'}$. Let $Q(G_{i_j} \ldots G_{i_{j+l'-1}})$ denote
the next sequence in $\{Q\}_{l'}$ after $G_{i_j} \ldots G_{i_{j+l'-1}}$. The
next sequence with the potential to not be equivalent to a sequence in
$\{S_{n-1}\}$ is

$$G_{i_1} \ldots G_{i_{j-1}} \, Q(G_{i_j} \ldots G_{i_{j+l'-1}}) \, G_1 \ldots G_1. \qquad (10.4)$$

The process of checking subsequences is then repeated on this new sequence.
Skipping sequences in this manner is vastly better than an exhaustive search,
and enables optimal sequences of interesting length to be obtained. It should
be stressed, however, that the runtime is still exponential in $l$.
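The skipping idea can be illustrated with a simplified depth-first variant. Note that this sketch is not the ordered increment of Eq. (10.4): it assumes it is acceptable to visit sequences in depth-first order and simply refuses to extend any sequence whose trailing window of gates is not a unique sequence. The gate set {H, T}, the tolerance, and all function names are assumptions made for the example.

```python
import cmath
import itertools
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
R = 1 / math.sqrt(2)
H = [[R, R], [R, -R]]
T = [[1.0, 0.0], [0.0, cmath.exp(1j * math.pi / 4)]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dist(u, v):
    """Eq. (10.1) for 2x2 matrices."""
    trace = sum(u[k][i].conjugate() * v[k][i] for i in range(2) for k in range(2))
    return math.sqrt(max(2 - abs(trace), 0.0) / 2)

def unique_sequences(gates, max_len, tol=1e-6):
    """Index tuples of the unique sequences of length at most max_len."""
    kept = [((), IDENTITY)]
    for length in range(1, max_len + 1):
        for seq in itertools.product(range(len(gates)), repeat=length):
            m = IDENTITY
            for idx in seq:
                m = matmul(m, gates[idx])
            if all(dist(m, q) > tol for _, q in kept):
                kept.append((seq, m))
    return {seq for seq, _ in kept}

def pruned_search(gates, target, max_len, window, tol=1e-6):
    """Best approximation of target using at most max_len gates, never
    extending a sequence whose trailing `window` gates are not unique."""
    unique = unique_sequences(gates, window, tol)
    best = {"dist": float("inf"), "seq": None, "compared": 0}

    def extend(seq, matrix):
        best["compared"] += 1
        d = dist(target, matrix)
        if d < best["dist"]:
            best["dist"], best["seq"] = d, seq
        if len(seq) == max_len:
            return
        for idx, gate in enumerate(gates):
            candidate = seq + (idx,)
            # A sequence containing a non-unique subsequence is equivalent to
            # an earlier sequence, so the whole branch can be skipped.
            if candidate[-window:] in unique:
                extend(candidate, matmul(matrix, gate))

    extend((), IDENTITY)
    return best

S = matmul(T, T)
result = pruned_search([H, T], S, max_len=3, window=2)
print(result["seq"], result["compared"])
```

In this toy case the search compares 11 sequences rather than the 15 sequences of length at most three (counting the empty one), since every branch containing HH is pruned. The saving grows quickly with the sequence length and the window size.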

Highly non-optimal but polynomial-runtime sequence-finding techniques
do exist [1201 ini3 Gnu HHBl but will not be discussed here.



10.2 Simple Steane code single-qubit gates 

For the remainder of the chapter we will restrict our attention to fault- 
tolerant single-qubit gates that can be applied to the 7-qubit Steane code. 
The Steane code representation of states |0) and |1) is jll.Sj 

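$$|0_L\rangle = \frac{1}{\sqrt{8}}\big(|0000000\rangle + |1010101\rangle + |0110011\rangle + |1100110\rangle
+ |0001111\rangle + |1011010\rangle + |0111100\rangle + |1101001\rangle\big), \qquad (10.5)$$

$$|1_L\rangle = \frac{1}{\sqrt{8}}\big(|1111111\rangle + |0101010\rangle + |1001100\rangle + |0011001\rangle
+ |1110000\rangle + |0100101\rangle + |1000011\rangle + |0010110\rangle\big). \qquad (10.6)$$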






An equivalent description of this code can be given in terms of stabilizers
[117], which are operators that map the logical states $|0_L\rangle$ and
$|1_L\rangle$ to themselves.

IIIXXXX (10.7) 

IXXIIXX (10.8) 

XIXIXIX (10.9) 

IIIZZZZ (10.10) 

IZZIIZZ (10.11) 

ZIZIZIZ (10.12) 

States $|0_L\rangle$ and $|1_L\rangle$ are the only two that are
simultaneously stabilized by Eqs. (10.7)-(10.12). Non-fault-tolerant circuits
for both a general and LNN architecture that take an arbitrary state
$\alpha|0\rangle + \beta|1\rangle$ and produce
$\alpha|0_L\rangle + \beta|1_L\rangle$ are shown in Fig. 10.1. The
fault-tolerant preparation of logical states is more complicated, and will be
discussed in the context of T-gate ancilla state preparation in Section 10.3.
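The relationship between the code words of Eqs. (10.5)-(10.6) and the stabilizers of Eqs. (10.7)-(10.12) can be checked mechanically. The sketch below, which uses hypothetical helper names, generates the eight bit strings of the logical zero state by applying all products of the X-type stabilizers to the all-zero string, forms the logical one strings as bitwise complements (as seen by comparing the two equations), and verifies that every string has even overlap with each Z-type stabilizer.

```python
from itertools import product

# Supports of the X-type stabilizers (10.7)-(10.9); the Z-type stabilizers
# (10.10)-(10.12) act on the same qubits.
SUPPORTS = ["0001111", "0110011", "1010101"]

def xor(a, b):
    """Bitwise XOR of two equal-length bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

def zero_codewords():
    """Bit strings obtained by applying every product of the X-type
    stabilizers to 0000000."""
    words = set()
    for picks in product([0, 1], repeat=len(SUPPORTS)):
        word = "0000000"
        for pick, support in zip(picks, SUPPORTS):
            if pick:
                word = xor(word, support)
        words.add(word)
    return words

def passes_z_checks(word):
    """True if the word has even overlap with every Z-type stabilizer."""
    return all(
        sum(1 for w, s in zip(word, support) if w == "1" and s == "1") % 2 == 0
        for support in SUPPORTS
    )

ZERO_L = zero_codewords()
ONE_L = {xor(word, "1111111") for word in ZERO_L}  # bitwise complements

print(sorted(ZERO_L))
print(all(passes_z_checks(w) for w in ZERO_L | ONE_L))
```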

The minimal universal set of single-qubit fault-tolerant gates that can
be applied to a Steane code logical qubit consists of just the Hadamard gate
and the T-gate

$$T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}. \qquad (10.13)$$

For practical purposes, the gates $X$, $Z$, $S$, $S^\dagger$ should be added
to this set, where

$$S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}, \qquad (10.14)$$

along with all gates generated by $H$, $X$, $Z$, $S$, $S^\dagger$. The
complete list of gates that we shall consider is shown in Eq. (10.15). This is
our set $\mathcal{G}$. Note that gates $\{I, G_1, \ldots, G_{23}\}$ form a
group under multiplication. Appendix A contains the multiplication table of
this group.

Fig. 10.1: Non-fault-tolerant 7-qubit Steane code encoding circuits taking an
arbitrary state $\alpha|0\rangle + \beta|1\rangle$ and producing
$\alpha|0_L\rangle + \beta|1_L\rangle$. (a) Depth 4 circuit for an
architecture able to interact arbitrary pairs of qubits. (b) Depth 5 circuit
for a linear nearest neighbor architecture.



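$$\begin{array}{ll}
G_1 = H & G_{13} = HS \\
G_2 = X & G_{14} = HS^\dagger \\
G_3 = Z & G_{15} = ZXH \\
G_4 = S & G_{16} = SXH \\
G_5 = S^\dagger & G_{17} = S^\dagger XH \\
G_6 = XH & G_{18} = HSH \\
G_7 = ZH & G_{19} = HS^\dagger H \\
G_8 = SH & G_{20} = HSX \\
G_9 = S^\dagger H & G_{21} = HS^\dagger X \\
G_{10} = ZX & G_{22} = S^\dagger HS \\
G_{11} = SX & G_{23} = SHS^\dagger \\
G_{12} = S^\dagger X & G_{24} = T
\end{array} \qquad (10.15)$$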



To justify the use of such a large set $\mathcal{G}$, consider the transversal
circuits shown in Fig. 10.2 implementing $H$, $X$, $Z$, $S$ and $S^\dagger$.
By combination, it can be seen that gates $\{G_6, \ldots, G_{23}\}$ can also
be implemented with simple transversal applications of single-qubit gates. As
we shall see in Section 10.3, by comparison the T-gate is extremely
complicated to implement. Since we are interested in minimal complexity as
well as minimum length sequences of gates in $\mathcal{G}$, it would be
unreasonable to count $G_{23}$ as three gates when in reality it can be
implemented as easily as any other gate $\{G_1, \ldots, G_{22}\}$. Since
$\{I, G_1, \ldots, G_{23}\}$ is a group under multiplication, minimum length
sequences of gates approximating some $U$ outside $\mathcal{G}$ will alternate
between an element of $\{G_1, \ldots, G_{23}\}$ and a T-gate. Note that the
$T^\dagger$-gate is not required in $\mathcal{G}$ for universality or
efficiency as, in gate sequences of length $l > 2$, it is equally efficient to
use $S^\dagger T$ or $TS^\dagger$. The extra gate is absorbed into neighboring
$G_i$-gates, $i < 24$.
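The group property used in this argument can be verified numerically. The following sketch builds the gates of Eq. (10.15) other than T as products of H, X, Z, S and its adjoint, and checks that, together with the identity, they are pairwise distinct up to global phase and closed under multiplication. The word encoding (lower-case `s` for the adjoint of S) and the helper names are assumptions of this example.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dist(u, v):
    """Eq. (10.1) for 2x2 matrices; zero when u and v agree up to phase."""
    trace = sum(u[k][i].conjugate() * v[k][i] for i in range(2) for k in range(2))
    return math.sqrt(max(2 - abs(trace), 0.0) / 2)

R = 1 / math.sqrt(2)
BASE = {
    "I": [[1, 0], [0, 1]],
    "H": [[R, R], [R, -R]],
    "X": [[0, 1], [1, 0]],
    "Z": [[1, 0], [0, -1]],
    "S": [[1, 0], [0, 1j]],
    "s": [[1, 0], [0, -1j]],  # lower-case s stands for S^dagger
}

# G_0 = I followed by G_1 ... G_23 as words over the base gates, Eq. (10.15).
WORDS = ["I", "H", "X", "Z", "S", "s", "XH", "ZH", "SH", "sH", "ZX", "SX",
         "sX", "HS", "Hs", "ZXH", "SXH", "sXH", "HSH", "HsH", "HSX", "HsX",
         "sHS", "SHs"]

def word_matrix(word):
    """Matrix product of the base gates named in word, left to right."""
    m = BASE["I"]
    for letter in word:
        m = matmul(m, BASE[letter])
    return m

GATES = [word_matrix(w) for w in WORDS]

def index_of(matrix, tol=1e-6):
    """Index of the listed gate equal to matrix up to global phase, or None."""
    for i, g in enumerate(GATES):
        if dist(matrix, g) < tol:
            return i
    return None

all_distinct = all(dist(GATES[i], GATES[j]) > 1e-6
                   for i in range(24) for j in range(i))
# table[i][j] = k means G_i G_j = G_k up to global phase.
table = [[index_of(matmul(a, b)) for b in GATES] for a in GATES]
closed = all(entry is not None for row in table for entry in row)
print(all_distinct, closed)
```

The nested list `table` plays the role of a multiplication table with the row index on the left factor; it is generated here from the definitions and has not been compared entry by entry with Appendix A.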

10.3 The fault-tolerant T-gate 

Moving on to implementing the fault-tolerant T-gate |3r)|, the basic idea is
to prepare an ancilla state $|0_L\rangle + e^{i\pi/4}|1_L\rangle$, then apply
the circuit shown in Fig. 10.3. Tracing the action of Fig. 10.3, we initially
have

$$(|0_L\rangle + e^{i\pi/4}|1_L\rangle)(\alpha|0_L\rangle + \beta|1_L\rangle). \qquad (10.16)$$

After applying the CNOT we obtain

$$\begin{aligned}
&\alpha|0_L\rangle|0_L\rangle + \beta|0_L\rangle|1_L\rangle + \alpha e^{i\pi/4}|1_L\rangle|1_L\rangle + \beta e^{i\pi/4}|1_L\rangle|0_L\rangle \\
&= (\alpha|0_L\rangle + \beta e^{i\pi/4}|1_L\rangle)|0_L\rangle + (\beta|0_L\rangle + \alpha e^{i\pi/4}|1_L\rangle)|1_L\rangle.
\end{aligned} \qquad (10.17)$$

After measuring the lower logical qubit, if $|0_L\rangle$ is observed (meaning
one of the eight bit strings shown in Eq. (10.5) or a bit string a single bit
different from one of these eight), no further action is required. If
$|1_L\rangle$ is observed, applying the logical gate $SX$ to the top qubit
will yield the desired state up to an irrelevant global phase. Note that the
measurement step and subsequent classical processing allows the correction of
a single bit-flip error and is insensitive to phase errors.
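The trace above can be reproduced at the logical level with two ordinary qubits standing in for the two logical qubits. This is a minimal sketch under that simplifying assumption (no encoding and no errors); the function names are illustrative.

```python
import cmath
import math

PHASE = cmath.exp(1j * math.pi / 4)

def t_gate_via_ancilla(alpha, beta, outcome):
    """Top-qubit amplitudes after the circuit of Fig. 10.3, given that the
    lower qubit was measured as `outcome` (0 or 1)."""
    ancilla = [1 / math.sqrt(2), PHASE / math.sqrt(2)]  # top qubit
    data = [alpha, beta]                                 # lower qubit
    # Joint amplitudes state[top][lower], as in Eq. (10.16).
    state = [[ancilla[t] * data[b] for b in (0, 1)] for t in (0, 1)]
    # CNOT controlled by the top qubit: flip the lower qubit when top = 1.
    state[1][0], state[1][1] = state[1][1], state[1][0]
    # Keep the branch consistent with the measurement and renormalize.
    top = [state[0][outcome], state[1][outcome]]
    norm = math.sqrt(sum(abs(a) ** 2 for a in top))
    top = [a / norm for a in top]
    if outcome == 1:
        top = [top[1], top[0]]        # apply X ...
        top = [top[0], 1j * top[1]]   # ... followed by S
    return top

def overlap(a, b):
    """Magnitude of the inner product, insensitive to global phase."""
    return abs(sum(complex(x).conjugate() * y for x, y in zip(a, b)))

alpha, beta = 0.6, 0.8j
expected = [alpha, PHASE * beta]  # T applied to alpha|0> + beta|1>
for outcome in (0, 1):
    print(outcome, overlap(expected, t_gate_via_ancilla(alpha, beta, outcome)))
```

For both measurement outcomes the overlap with the ideal T-gate output should equal 1 up to rounding, which confirms that the SX correction only introduces a global phase.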

Fig. 10.2: Circuits fault-tolerantly applying common single-qubit gates to
Steane code logical qubits. Gates in brackets are optional as they implement
stabilizers.

Fig. 10.3: High-level representation of the circuit implementing the T-gate on
a Steane code logical qubit. Input $\alpha|0_L\rangle + \beta|1_L\rangle$ is
transformed into $\alpha|0_L\rangle + e^{i\pi/4}\beta|1_L\rangle$.

Fig. 10.4: Circuit measuring whether $|\Psi\rangle$ is in the $+1$ or $-1$
eigenstate of $A$.

To fault-tolerantly prepare the ancilla state, we first need to be able to
fault-tolerantly prepare the state $|0_L\rangle$. As we shall see, to do this,
we need to be able to fault-tolerantly determine whether a state
$|\Psi\rangle$ is in the $+1$ or $-1$ eigenstate of a self-inverse operator
$A$ ($A^2 = I$). A non-fault-tolerant circuit doing this is shown in
Fig. 10.4. It is instructive to trace the action of the circuit, omitting
normalization factors. The initial state is $|0\rangle|\Psi\rangle$, which
after the first Hadamard becomes $(|0\rangle + |1\rangle)|\Psi\rangle$. After
the controlled-$A$ the state becomes
$|0\rangle|\Psi\rangle + |1\rangle A|\Psi\rangle$. After the second Hadamard

$$\begin{aligned}
&(|0\rangle + |1\rangle)|\Psi\rangle + (|0\rangle - |1\rangle)A|\Psi\rangle \\
&= |0\rangle(|\Psi\rangle + A|\Psi\rangle) + |1\rangle(|\Psi\rangle - A|\Psi\rangle).
\end{aligned} \qquad (10.18)$$

If a zero is measured, the lower qubit will be in the $+1$ eigenstate
$|\Psi\rangle + A|\Psi\rangle$. Conversely, if one is measured, the lower
qubit will be in the $-1$ eigenstate $|\Psi\rangle - A|\Psi\rangle$.
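The eigenvalue measurement circuit of Fig. 10.4 is easy to simulate for a single target qubit. The sketch below keeps the normalization that the trace above omits and takes A = X as an example; both choices, and the function names, are assumptions made for illustration.

```python
import math

def apply(a, psi):
    """Apply a 2x2 matrix to a 2-component state."""
    return [a[0][0] * psi[0] + a[0][1] * psi[1],
            a[1][0] * psi[0] + a[1][1] * psi[1]]

def eigenvalue_measurement(a, psi):
    """Unnormalized target states for control outcomes 0 and 1 after
    Hadamard, controlled-A, Hadamard acting on |0>|psi>."""
    r = 1 / math.sqrt(2)
    # First Hadamard: the control is in (|0> + |1>)/sqrt(2).
    branch0 = [r * x for x in psi]
    branch1 = [r * x for x in psi]
    # Controlled-A acts on the target only in the control = 1 branch.
    branch1 = apply(a, branch1)
    # Second Hadamard mixes the two control branches.
    out0 = [r * (x + y) for x, y in zip(branch0, branch1)]
    out1 = [r * (x - y) for x, y in zip(branch0, branch1)]
    return out0, out1

X = [[0, 1], [1, 0]]
psi = [0.6, 0.8]
plus, minus = eigenvalue_measurement(X, psi)
prob0 = sum(abs(x) ** 2 for x in plus)
prob1 = sum(abs(x) ** 2 for x in minus)
print(plus, minus, prob0, prob1)
```

The outcome-0 branch is proportional to the state plus A applied to the state and is left unchanged by A, while the outcome-1 branch changes sign under A. The squared norms of the two branches are the measurement probabilities.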

The specific self-inverse operators we wish to measure are the stabilizers
Eqs. (10.7)-(10.12). To build a fault-tolerant circuit measuring these
multiple-qubit operators, the control qubit shown in Fig. 10.4 must be
replaced by a cat state so that each qubit modified by the stabilizer is
controlled by a different qubit in the cat state. This is necessary to prevent
a single error in a control qubit propagating to multiple target qubits. This
in turn necessitates fault-tolerant cat state preparation, which is shown in
Fig. 10.5a [199]. A single bit- or phase-flip anywhere in this circuit causes
at most one error in the final state. This circuit is significantly simpler,
and no less robust, than the fault-tolerant cat state preparation circuit
suggested in jSUl (Fig. 10.6). The uncat circuit of Fig. 10.5b is
fault-tolerant purely because its output is a single qubit and by definition a
single error can cause at most one error in the output.

Fig. 10.5: (a) Simple circuit fault-tolerantly preparing the cat state
$|0000\rangle + |1111\rangle$; if the measurement gives M = 1, the circuit is
restarted. (b) Circuit undoing the preparation of a cat state.

Using the circuit notation shown in Fig. 10.7, the complete circuit for
fault-tolerantly measuring a stabilizer is shown in Fig. 10.8. Note that the
basic stabilizer measurement circuit appears three times since a single error
in a cat state block, while not propagating to multiple qubits in the logical
state block, almost always causes an incorrect measurement. To ensure a
probability $O(p^2)$ of incorrect measurement, the process must be repeated
up to three times. The third measurement structure can be omitted if the
first two measurements are the same. The final triply controlled Z-gate is
only applied if the majority of the measurements are one. Note that this
assumes fast and reliable classical processing is available. The final Z-gate
converts a $-1$ eigenstate of XIXIXIX into a $+1$ eigenstate. Thus the output
of Fig. 10.8 is the $+1$ eigenstate of XIXIXIX with probability $O(p^2)$ of
failure (i.e. more than one incorrect output qubit).

Fig. 10.6: Typical, but unnecessarily complicated, circuit fault-tolerantly
preparing a cat state. If any measurement gives M = 1, the circuit is
restarted.

We now have the necessary tools to fault-tolerantly prepare $|0_L\rangle$.
Recall that $|0_L\rangle$ and $|1_L\rangle$ are the only two states
simultaneously stabilized by all of Eqs. (10.7)-(10.12). If we include the
logical $Z$ operator, $|0_L\rangle$ is the unique state stabilized by $Z$ and
all six stabilizers. The state $|0_L\rangle$ could thus be created using the
circuit of Fig. 10.9, which outputs $|0_L\rangle$ for arbitrary input
$|\Psi\rangle$.

A better way of obtaining $|0_L\rangle$ is to start with the state
$|0000000\rangle$, which is physically accessible in a quantum computer
architecture either via some form of special reset operation, or measurement
possibly followed by an X-gate. State $|0000000\rangle$ is a $+1$ eigenstate
of logical $Z$ and Eqs. (10.10)-(10.12), therefore only stabilizers
Eqs. (10.7)-(10.9) need to be measured (Fig. 10.10).

To complete the construction of the ancilla state, and hence the T-gate
(Fig. 10.11), the operator $e^{i\pi/4}X$ is measured. Note that
$e^{i\pi/4}X$ is not self-inverse, but nevertheless the circuit works as
required. Specifically, before cat state preparation we have
$|000\rangle|0_L\rangle$. After cat state creation this becomes

$$\frac{1}{\sqrt{2}}(|000\rangle + |111\rangle)|0_L\rangle. \qquad (10.19)$$

After transversal CNOT we have

$$\frac{1}{\sqrt{2}}(|000\rangle|0_L\rangle + |111\rangle|1_L\rangle). \qquad (10.20)$$

Note that only three physical CNOT gates are required to implement a logical
CNOT gate on the Steane code due to its stabilizer structure. After the
single-qubit T-gate we have

$$\frac{1}{\sqrt{2}}(|000\rangle|0_L\rangle + e^{i\pi/4}|111\rangle|1_L\rangle). \qquad (10.21)$$

After uncat we have

$$|0\rangle\frac{1}{2}(|0_L\rangle + e^{i\pi/4}|1_L\rangle) + |1\rangle\frac{1}{2}(|0_L\rangle - e^{i\pi/4}|1_L\rangle), \qquad (10.22)$$

resulting in the state $(|0_L\rangle + e^{i\pi/4}|1_L\rangle)/\sqrt{2}$ if
zero is measured, and $(|0_L\rangle - e^{i\pi/4}|1_L\rangle)/\sqrt{2}$ if one
is measured. Note that the steps shown in Eqs. (10.19)-(10.22) must be
repeated up to three times to be able to say that a 0 or 1 has been measured
with probability of error $O(p^2)$. The final logical Z-gate converts
$(|0_L\rangle - e^{i\pi/4}|1_L\rangle)/\sqrt{2}$ into
$(|0_L\rangle + e^{i\pi/4}|1_L\rangle)/\sqrt{2}$. Under the assumptions that
2-qubit gates, measurement, reset and classical processing each have depth 1,
single-qubit gates have depth zero and do not contribute to the gate count,
and arbitrary disjoint 2-qubit gates can be implemented in parallel,
Table 10.1 summarizes the best case complexity of the T-gate.

Fig. 10.7: Symbolic representation of transversal controlled operations.

Fig. 10.8: Circuit fault-tolerantly projecting $|\Psi\rangle$ onto the
$\pm 1$ eigenstates of XIXIXIX, then converting $-1$ eigenstates into $+1$
eigenstates. The third measurement structure can be omitted if $M_1 = M_2$.

Fig. 10.9: Circuit taking arbitrary input and producing $|0_L\rangle$ by
repeated stabilizer measurement.

Fig. 10.10: Circuit taking arbitrary input and producing $|0_L\rangle$ via
physical resetting and just three stabilizer measurements.

    Circuit Element    Count
    Qubits             19
    Gates              93
    Resets             45
    Measurements       17
    Depth              92

Tab. 10.1: Best case complexity of the T-gate.

Fig. 10.11: Complete circuit implementing the T-gate on a Steane code logical
qubit.
10.4 Approximations of phase gates

We now use the machinery described in this chapter to construct optimal
fault-tolerant approximations of single-qubit phase rotation gates

$$R_{2^d} = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/2^d} \end{pmatrix}. \qquad (10.23)$$

Gates $R_{2^d}$ are examples of gates used in the single-qubit quantum
Fourier transform that forms part of the Shor circuits described in Chapters
ISHHl. Note that phase rotations of angle $2\pi x/2^d$, where $x$ is a
$d$-digit binary number, are also required, but the properties of
fault-tolerant approximations of such gates can be inferred from $R_{2^d}$.

For a given $R_{2^d}$, and maximum number of gates $l$ in $\mathcal{G}$,
Fig. 10.12 shows $\mathrm{dist}(R_{2^d}, U_l)$, where $U_l$ is an optimal
sequence of at most $l$ gates in $\mathcal{G}$ minimizing
$\mathrm{dist}(R_{2^d}, U_l)$. For $d > 3$, $U_l$ is equivalent to the
identity. Note that as $d$ increases, $R_{2^d}$ becomes closer and closer to
the identity, lowering the value of $\mathrm{dist}(R_{2^d}, I)$ and increasing
the value of $l$ required to obtain an approximation $U_l$ that is closer to
$R_{2^d}$ than the identity. In fact, for $R_{128}$ the shortest sequence of
gates that provides a better approximation of $R_{128}$ than the identity has
length $l = 31$. There are a very large number of optimal sequences of this
length. An example of one with a minimal number of T-gates is

    HTHT(SH)T(SH)T(SH)THTHT(SH)
    THTHT(SH)THTHTHT(SH)T(Sm)                                (10.24)

The parentheses group standard gates into elements of the set $\mathcal{G}$
defined in Eq. (10.15). Note that
$\mathrm{dist}(R_{128}, I) = 8.7 \times 10^{-3}$ whereas
$\mathrm{dist}(R_{128}, U_{31}) = 8.1 \times 10^{-3}$. In other words
Eq. (10.24) is only slightly better than the identity. This immediately raises
the question of how many gates are required to construct a sufficiently good
approximation.

Fig. 10.12: Optimal fault-tolerant approximations $U_l$ of phase rotation
gates $R_{2^d}$, for $R_8$ to $R_{128}$.



In Chapter 9 it was shown that

$$U = \begin{pmatrix} 1 & 0 \\ 0 & e^{i(\pi/128 + \pi/512)} \end{pmatrix} \qquad (10.25)$$

was sufficiently close to $R_{128}$. This is, of course, only a property of
Shor's algorithm, not a universal property of quantum circuits. Given
$\mathrm{dist}(R_{128}, U) = 2.2 \times 10^{-3}$, a sufficiently accurate
fault-tolerant approximation $U_l$ of $R_{128}$ must therefore satisfy
$\mathrm{dist}(R_{128}, U_l) < 2.2 \times 10^{-3}$. The smallest value of $l$
for which this is true is 46, and one of the many optimal gate sequences
satisfying $\mathrm{dist}(R_{128}, U_{46}) = 7.5 \times 10^{-4}$ is

    U_46 = HTHTHT(SH)THT(SH)T(SH)T(SH)THT
           (SH)T(SH)THTHT(SH)T(SH)THT(SH)T                   (10.26)
           (SH)T(SH)THT(SH)THT(HS†)T

Parentheses again group standard gates into elements of $\mathcal{G}$. Now
that we have a minimal complexity circuit sufficiently close to $R_{128}$, the
immediate question is whether it is practical. An alternative to Eq. (10.26)
is shown in Fig. 10.13, which simply decodes the logical qubit, applies
$R_{128}$, then re-encodes. This simple non-fault-tolerant circuit will fail
(generate more than one error in the output logical qubit) if a single error
occurs almost anywhere in the top six qubits. Given there are
$11 \times 6 = 66$ possible error locations, the probability of no errors in
the top six qubits is $(1 - p)^{66}$. This is the worst-case reliability of
the circuit.
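The two distances quoted in this section for the rotation by π/128 can be checked directly from Eq. (10.1). The sketch below assumes the phase rotation is the diagonal matrix with entries 1 and exp(iπ/128), which is the convention consistent with the quoted values; the helper names are illustrative.

```python
import cmath
import math

def phase_gate(angle):
    """Diagonal single-qubit gate diag(1, exp(i * angle))."""
    return [[1.0, 0.0], [0.0, cmath.exp(1j * angle)]]

def dist(u, v):
    """Eq. (10.1) for 2x2 matrices: sqrt((2 - |tr(u^dagger v)|) / 2)."""
    trace = sum(u[k][i].conjugate() * v[k][i] for i in range(2) for k in range(2))
    return math.sqrt(max(2 - abs(trace), 0.0) / 2)

IDENTITY = phase_gate(0.0)
R128 = phase_gate(math.pi / 128)
U = phase_gate(math.pi / 128 + math.pi / 512)  # Eq. (10.25)

d_identity = dist(R128, IDENTITY)
d_tolerance = dist(R128, U)
print(f"{d_identity:.1e} {d_tolerance:.1e}")
```

Under this convention the two values round to 8.7e-03 and 2.2e-03, matching the figures used above to decide when an approximating sequence is accurate enough.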

A partial schematic of the circuit corresponding to Eq. (10.26) is shown
in Fig. 10.14. As the circuit is fault-tolerant, it only fails if at least two
errors occur within the circuit. Any analysis of the reliability of the
circuit is complicated by the fact that the T-gates that comprise the bulk of
the circuit have error correction built in at a number of places. Furthermore,
when errors are detected and corrected, the circuit typically increases in
depth. Referring back to Fig. 10.11, we shall assume that the T-gate is only
sensitive to errors in the lower 14 qubits and that the depth of the circuit
is never increased by errors. From Table 10.1, the best case depth of the
T-gate is 92. This implies an area sensitive to errors of approximately 1300.
Given there are 23 T-gates in Fig. 10.14, the total area sensitive to errors
is approximately 30000. The reliability of Fig. 10.14 is thus approximately
$(1 - p)^{30000} + 30000p(1 - p)^{29999}$, which is only greater than that of
the non-fault-tolerant circuit for $p < 1.4 \times 10^{-7}$.

Fig. 10.13: Non-fault-tolerant circuit exactly implementing $R_{128}$ by first
decoding the logical qubit and re-encoding after application of $R_{128}$.

Fig. 10.14: Schematic of a minimum complexity, sufficiently accurate
fault-tolerant approximation of $R_{128}$ given in full by Eq. (10.26).

Fig. 10.15: Circuit fault-tolerantly correcting an arbitrary single error
within the logical qubit. The qubit acted on by the correct Z/X boxes is
described by Table 10.2.

A fault-tolerant circuit correcting an arbitrary single error in a Steane
logical qubit is shown in Fig. 10.15. Consider the first half of the circuit.
Given eigenvalue measurements $E_1$, $E_2$, $E_3$, the appropriate qubit to
correct is shown in Table 10.2. Note that slightly less complex circuits exist
that use more qubits [126], but our choice of circuit will not play a
significant role in the following analysis. By applying error correction to
the logical qubit after each T-gate in Fig. 10.14, the reliability of the
circuit can be increased. The best case depth of Fig. 10.15 is 120, and if we
assume that the circuit is only sensitive to errors in the lower seven qubits,
the total area sensitive to errors is approximately 800. The error corrected
$U_{46}$ circuit will only fail if two errors occur within a single T-gate and
error correction block. The reliability of a single block is
$(1 - p)^{2100} + 2100p(1 - p)^{2099}$. The failure probabilities of the
non-fault-tolerant circuit, the fault-tolerant circuit without correction,
with correction after every second T-gate, and with correction after every
T-gate are compared in Fig. 10.16.
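The reliability expressions above can be compared numerically to locate the error rate below which a fault-tolerant circuit outperforms the non-fault-tolerant one. The sketch below uses the rounded areas quoted in the text (66, 30000, and 23 blocks of 2100) and a simple bisection; the function names and the bisection bracket are assumptions of this example.

```python
import math

def no_error(p, area):
    """Probability (1 - p)^area that no error occurs in the given area."""
    return math.exp(area * math.log1p(-p))

def at_most_one_error(p, area):
    """(1 - p)^area + area * p * (1 - p)^(area - 1)."""
    return no_error(p, area) + area * p * no_error(p, area - 1)

def fail_nft(p):
    """Non-fault-tolerant circuit: fails on any error in 66 locations."""
    return 1.0 - no_error(p, 66)

def fail_ft(p):
    """Fault-tolerant circuit without correction: area 30000."""
    return 1.0 - at_most_one_error(p, 30000)

def fail_ft_corrected(p):
    """Correction after every T-gate: 23 blocks of area 2100 each."""
    return 1.0 - at_most_one_error(p, 2100) ** 23

def crossover(fail_a, fail_b, lo=1e-9, hi=1e-4):
    """Bisection on a log scale for the p where fail_a(p) = fail_b(p),
    assuming fail_a < fail_b at lo and fail_a > fail_b at hi."""
    for _ in range(100):
        mid = math.sqrt(lo * hi)
        if fail_a(mid) < fail_b(mid):
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

p_ft = crossover(fail_ft, fail_nft)
p_corrected = crossover(fail_ft_corrected, fail_nft)
print(f"{p_ft:.2e} {p_corrected:.2e}")
```

With these rounded areas the crossovers come out on the order of 1e-7 without correction and 1e-6 with correction after every T-gate. Small differences from the thresholds quoted in the text are expected because the areas used here are approximate.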

Of the fault-tolerant circuits, the one with error correction after every
T-gate performs best. Nevertheless, this circuit is still only more reliable
than the non-fault-tolerant circuit for $p < 1.3 \times 10^{-6}$. Given that
$p \sim 10^{-6}$ is likely to be very difficult to achieve in practice, longer
error correction code words or concatenation would be required to make the
fault-tolerant circuit practical [127]. Given that Fig. 10.14 is both
extremely complex and the simplest fault-tolerant circuit sufficiently close
to $R_{128}$, for practical computation non-fault-tolerant circuits similar to
Fig. 10.13 are likely to remain the best way to implement arbitrary rotations
for the foreseeable future.

Tab. 10.2: Qubit to correct given certain sequence of eigenvalue measurements
in both halves of Fig. 10.15.

Fig. 10.16: Approximate probability of more than one error in the output
logical qubit versus probability per qubit per time step of discrete error
for different circuits implementing a $R_{128}$ phase rotation gate. (a) NFT:
non-fault-tolerant circuit from Fig. 10.13. (b) FT I: fault-tolerant circuit
from Fig. 10.14. (c) FT II: as above but with Fig. 10.15 error correction
after every second T-gate. (d) FT III: as above but with error correction
after every T-gate. Note that all fault-tolerant results are for the 7-qubit
Steane code without concatenation.

In Shor's algorithm, the use of non-fault-tolerant rotations would be
acceptable as only $2L$ such gates are used to factorize an $L$-bit number
$N$. Furthermore, only half of Fig. 10.13 would be required as these gates
immediately precede measurement, and there is no point re-encoding before
measurement. In a 4096 bit factorization, the total area of non-fault-tolerant
circuit would be approximately $2 \times 10^5$. Assuming the rest of the Shor
circuit uses sufficient error correction to be reliable, if $p \sim 10^{-5}$,
the average number of errors in the non-fault-tolerant part of the circuit
would be two, which is completely manageable with just a few repetitions of
the entire circuit or minimal classical processing.



In this section, we investigate the properties of fault-tolerant
approximations of arbitrary single-qubit gates

(10.27)

Consider Fig. 10.17. This was constructed using 1000 random matrices $U$ of
the form Eq. (10.27) with $\alpha, \beta, \theta$ uniformly distributed in
$[0, 2\pi)$. Optimal fault-tolerant approximations $U_l$ were constructed of
each, with the average $\mathrm{dist}(U, U_l)$ plotted for each $l$. The
indicated line of best fit has the form

$$\delta = 0.292 \times 10^{-0.0511 l}. \qquad (10.28)$$

This equation characterizes the average number $l$ of Steane code single-qubit
fault-tolerant gates required to obtain a fault-tolerant approximation $U_l$
of an arbitrary single-qubit gate $U$ to within
$\delta = \mathrm{dist}(U, U_l)$.

An important point to note is that even with unlimited resources Eq. (10.28)
does not provide a pathway to construct arbitrarily accurate gates. The
accuracy of the fault-tolerant T-gate described in Section 10.3 depends
critically on the accuracy of a single physical T-gate (Eq. (10.21)). Any
over- or under-rotation at this point will be directly reflected in the output
state of the logical qubit. Since half the gates in an optimal fault-tolerant
approximation are T-gates, as the number of gates increases rotation errors
will inevitably accumulate.

Consider the over-rotation gate

$$I_\theta = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\theta} \end{pmatrix}, \qquad (10.29)$$

where $\theta \ll 1$. For sufficiently small $\theta$,
$\delta = \mathrm{dist}(I, I_\theta) = \theta/\sqrt{8}$. Note that any metric
on $U(2)$ modulo global phase must have the property $\delta \propto \theta$,
and hence these results are not expected to depend on the precise metric used.
In the logical T-gate, even if there is systematic over-rotation in the single
physical T-gate, the stochastic nature of Eq. (10.22) ensures that the final
logical state will be out by a random angle $\pm\theta$. This implies that a
fault-tolerant approximation involving $l/2$ T-gates will be uncertain by an
amount $\delta = \theta\sqrt{l/16}$. The inequality
$\theta\sqrt{l/16} < 0.292 \times 10^{-0.0511 l}$ therefore sets the maximum
number of gates that can meaningfully be included in a fault-tolerant
approximation. Note that even if $\theta \sim 10^{-4}$, $l_{\max}$ is only 60,
a number of gates accessible using the algorithm described in this chapter.

Fig. 10.17: Average accuracy of optimal fault-tolerant gate sequence
approximations of length $l$. (Plot title: approximations $U_l$ of random
gates; horizontal axis: number of gates $l$.)
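The bound on the useful sequence length can be evaluated directly from the inequality above. The sketch below assumes the accumulated rotation uncertainty for a sequence of l gates is the over-rotation angle times the square root of l/16, as stated in the text, and searches for the largest l that still satisfies the inequality; the function names are illustrative.

```python
import math

def fit_accuracy(l):
    """Average accuracy of an l-gate approximation, Eq. (10.28)."""
    return 0.292 * 10 ** (-0.0511 * l)

def rotation_uncertainty(theta, l):
    """Uncertainty accumulated over l/2 T-gates, each off by +/- theta."""
    return theta * math.sqrt(l / 16)

def max_useful_length(theta, limit=1000):
    """Largest l (up to limit) for which the accumulated rotation
    uncertainty is still below the accuracy promised by Eq. (10.28)."""
    longest = 0
    for l in range(1, limit + 1):
        if rotation_uncertainty(theta, l) < fit_accuracy(l):
            longest = l
    return longest

for theta in (1e-3, 1e-4, 1e-5):
    print(theta, max_useful_length(theta))
```

For an over-rotation angle of 1e-4 the result is close to 60 gates, in line with the estimate given above, and it grows only slowly as the physical T-gate becomes more accurate.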



10.6 Conclusion 

We have described an algorithm enabling the optimal approximation of
arbitrary unitary matrices given a discrete universal gate set. We have used
this algorithm to investigate the properties of fault-tolerant approximations
of arbitrary single-qubit gates using the gates that can be applied to a
single Steane code logical qubit and found that on average an $l$ gate
approximation can be found within $\delta = 0.292 \times 10^{-0.0511 l}$ of
the ideal gate. We have considered the specific case of the phase rotation
gates used in Shor's algorithm and found that even the minimal complexity
fault-tolerant circuits obtained are still so large that they are outperformed
by non-fault-tolerant equivalents. The work here suggests that practical
quantum algorithms should avoid, where possible, logical gates that must be
implemented using an approximate sequence of fault-tolerant gates. An
important extension of this work would be to similarly examine the properties
of fault-tolerant approximations of multiple-qubit gates and larger circuits.



11. Concluding remarks 



Neglecting Chapters 11] and which contain review material only, in this 
thesis we 

1. (Chapter 2) performed simulations of the adiabatic Kane $^{31}$P in
$^{28}$Si CNOT gate suggesting that achieving a probability of error less than
10~^ is possible provided the presence of the silicon dioxide layer, 
gate electrodes, and control circuitry do not reduce the experimentally
measured coherence times of the $^{31}$P donor electron and nucleus, which
were obtained in bulk $^{28}$Si, by more than a factor of 6.

2. (Chapter 3) performed simulations of the adiabatic Kane $^{31}$P in
$^{28}$Si readout operation suggesting that the fidelity, stability, and
accessibility of the states required to transfer nuclear spin information onto
the donor electrons is insufficient to permit readout. We briefly outlined
an alternative readout scheme based on resonant fields.

3. (Chapter 5) presented a simple 5-qubit quantum error correction (QEC)
scheme designed for a linear nearest neighbor (LNN) architecture,
simulating its performance when subjected to both discrete and continuous
errors. Threshold error rates, at which a QEC scheme provides precisely no
reduction in error, were obtained for both discrete
{p = 1.6 X 10^'^) and continuous {a = 4.7 x 10^^) errors. 
4. (Chapter 6) showed that it is possible to remove the need for measurement
in the QEC scheme of Chapter 5 and, with the addition of four
qubits, make do with slow resetting. Discrete and continuous thresh- 
old error rates were reduced to p = 3.7 x lO""^ and cj = 3.1 x 10"^ 
respectively. 

5. (Chapter IH)) presented a minimal qubit count LNN circuit implement- 
ing the quantum part of Shor's L-bit integer factoring algorithm. 
Achieved circuit depth and gate count identical to leading order in 
L to that possible when long-range interactions are available. 

6. (Chapter 9) showed that Shor's algorithm can be used to factor integers
$O(4^d)$ bits long, provided single-qubit phase rotations of angle
$\pi/2^d$ can be implemented. Specifically, with sufficient qubits and error
correction, we showed that a 4096 bit integer could conceivably be
factored in a single day provided single-qubit phase rotations of angle
$\pi/128 \pm \pi/512$ could be implemented.

7. (Chapter 10) presented a numerical algorithm capable of obtaining
optimal fault-tolerant approximations of arbitrary single-qubit gates.
Used this algorithm to assess the properties of fault-tolerant approximations
of single-qubit phase rotations with the conclusion that it
is better to use simple non-fault-tolerant circuits to implement phase
rotations in Shor's algorithm.

Significant further work is planned in three broad areas. Firstly, overcoming
or coping with the lack of long-range communication in the solid-state. For
example, teleportation has been proposed as a possible long-range
communication technique [200, 201] but the details of how this would be
implemented in practice have yet to be worked out. A threshold gate error rate
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p = 2.4 X 10^ below which an arbitrarily large quantum computation can 
be performed on an LNN architecture has been shown to exist for a simple
discrete error model [125], but it is highly desirable that better LNN
circuits with a higher threshold are found, and that an analysis using a more
physical model of errors is performed.

Secondly, many quantum algorithms have now been proposed but very
few have been analyzed for practicality, especially when only a limited set of
quantum gates is available. A similar analysis to that carried out for Shor's
algorithm in this thesis could be applied to quantum algorithms dealing with
Poincaré recurrences and periodic orbits (classical dynamics) [202],
eigenvalue calculation [TJ, pattern recognition ^21) Schur and Clebsch-Gordan
transforms (technically not algorithms in their own right) ^SIi numerical
integrals and stochastic processes Jl], black box function determination
[203], Jones polynomials (knot theory), a vast array of problems in the field
of quantum system simulation [204, 205], a somewhat controversial algorithm
related to the classically uncomputable halting problem jl2J, and a number
of promising algorithms based on adiabatic quantum computation I5()j .
Explicit techniques for the translation of adiabatic quantum algorithms into
quantum circuits also need to be developed.

Thirdly and finally, while we have developed a quantum compiler capable 
of optimally approximating an arbitrary single-qubit gate, it would be 
extremely interesting to look at optimal approximations of larger 
circuits. Exact quantum compilation has received a great deal of attention 
[206, 207, 208], but in the worst case it results in circuits containing a 
number of gates that scales exponentially with the number of qubits. By 
contrast, the number of gates needed to approximate an arbitrary 
computation scales logarithmically with the required accuracy, and even if 
the required accuracy increases exponentially with the number of qubits, 
this implies a polynomial growth of gate count. 
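
The scaling argument in the preceding paragraph can be written out in one line. The sketch below is illustrative only: the symbols N (gate count), ε (required accuracy), n (number of qubits), and the constants c and α are notation assumed here, not taken from the thesis. The case c = 1 is the purely logarithmic scaling stated above; a general c is included so the same step also covers polylogarithmic bounds.

```latex
\begin{align}
N(\epsilon) &= O\!\left(\log^{c}(1/\epsilon)\right), \\
\epsilon = 2^{-\alpha n} \;&\Rightarrow\; \log(1/\epsilon) = \alpha n \log 2, \\
N &= O\!\left((\alpha n)^{c}\right) = O\!\left(n^{c}\right).
\end{align}
```

Under these assumptions, an accuracy requirement that tightens exponentially in n costs only a polynomial number of additional gates.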



Appendix 



A. Simple Steane code gates 



This appendix contains the complete multiplication tables of the largest 
possible group of fault-tolerant Steane code single logical qubit gates that 
can be implemented as products of single physical qubit rotations. These 
gates were introduced in Chapter 10. For convenience, we list these gates 
again below. 



G0  = I          G12 = S†X
G1  = H          G13 = HS
G2  = X          G14 = HS†
G3  = Z          G15 = ZXH
G4  = S          G16 = SXH
G5  = S†         G17 = S†XH
G6  = XH         G18 = HSH
G7  = ZH         G19 = HS†H
G8  = SH         G20 = HSX
G9  = S†H        G21 = HS†X
G10 = ZX         G22 = S†HS
G11 = SX         G23 = SHS†          (A.1)



The tables below show GiGj = Gk (up to a global phase), with i the 
vertical index and j the horizontal index. 
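
The products of the gates in Eq. (A.1) can be checked numerically. The following is a minimal sketch, not the procedure used in the thesis: it assumes the standard 2x2 matrix forms of I, X, Z, H and S, and treats two products as equal when they differ only by a global phase. The names `GATES`, `TABLE`, `prod`, `same_up_to_phase` and `index_of` are hypothetical helpers introduced for illustration.

```python
import numpy as np

# Elementary single-qubit gates as 2x2 matrices (standard definitions).
I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.array([[1, 0], [0, 1j]], dtype=complex)
Sd = S.conj().T  # S-dagger


def prod(*ms):
    """Matrix product of the arguments, in the order written."""
    out = I
    for m in ms:
        out = out @ m
    return out


# G0 ... G23 in the order of Eq. (A.1).
GATES = [
    I, H, X, Z, S, Sd,
    prod(X, H), prod(Z, H), prod(S, H), prod(Sd, H),
    prod(Z, X), prod(S, X), prod(Sd, X),
    prod(H, S), prod(H, Sd),
    prod(Z, X, H), prod(S, X, H), prod(Sd, X, H),
    prod(H, S, H), prod(H, Sd, H), prod(H, S, X), prod(H, Sd, X),
    prod(Sd, H, S), prod(S, H, Sd),
]


def same_up_to_phase(a, b, tol=1e-9):
    """True if the 2x2 unitaries a and b differ only by a global phase."""
    return abs(abs(np.trace(a.conj().T @ b)) - 2.0) < tol


def index_of(u):
    """Index k such that u equals Gk up to a global phase."""
    matches = [k for k, g in enumerate(GATES) if same_up_to_phase(u, g)]
    if len(matches) != 1:
        raise ValueError("product is not a unique member of the gate list")
    return matches[0]


# TABLE[i][j] = k means GiGj = Gk, with i the row and j the column.
TABLE = [[index_of(gi @ gj) for gj in GATES] for gi in GATES]

if __name__ == "__main__":
    for i, row in enumerate(TABLE):
        print(f"G{i:<2} | " + " ".join(f"{k:2d}" for k in row))
```

Running the script prints one row per Gi, with each entry the index k of the product GiGj. Because the 24 gates form a group, every row and every column of `TABLE` should contain each index exactly once, which gives a quick consistency check on any transcription of the tables.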
i\j  |   G0   G1   G2   G3   G4   G5   G6   G7   G8   G9  G10  G11  G12
-----+-----------------------------------------------------------------
G0   |   G0   G1   G2   G3   G4   G5   G6   G7   G8   G9  G10  G11  G12
G1   |   G1   G0   G7   G6  G13  G14   G3   G2  G18  G19  G15  G20  G21
G2   |   G2   G6   G0  G10  G12  G11   G1  G15  G17  G16   G3   G5   G4
G3   |   G3   G7  G10   G0   G5   G4  G15   G1   G9   G8   G2  G12  G11
G4   |   G4   G8  G11   G5   G3   G0  G16   G9   G7   G1  G12  G10   G2
G5   |   G5   G9  G12   G4   G0   G3  G17   G8   G1   G7  G11   G2  G10
G6   |   G6   G2  G15   G1  G14  G13  G10   G0  G19  G18   G7  G21  G20
G7   |   G7   G3   G1  G15  G21  G20   G0  G10  G23  G22   G6  G14  G13
G8   |   G8   G4   G9  G16  G19  G23   G5  G11  G14  G21  G17  G18  G22
G9   |   G9   G5   G8  G17  G22  G18   G4  G12  G20  G13  G16  G23  G19
G10  |  G10  G15   G3   G2  G11  G12   G7   G6  G16  G17   G0   G4   G5
G11  |  G11  G16   G4  G12   G2  G10   G8  G17   G6  G15   G5   G0   G3
G12  |  G12  G17   G5  G11  G10   G2   G9  G16  G15   G6   G4   G3   G0





i\j  |  G13  G14  G15  G16  G17  G18  G19  G20  G21  G22  G23
-----+-------------------------------------------------------
G0   |  G13  G14  G15  G16  G17  G18  G19  G20  G21  G22  G23
G1   |   G4   G5  G10  G22  G23   G8   G9  G11  G12  G16  G17
G2   |  G14  G13   G7   G9   G8  G19  G18  G21  G20  G23  G22
G3   |  G21  G20   G6  G17  G16  G23  G22  G14  G13  G19  G18
G4   |  G19  G23  G17  G15   G6  G14  G21  G18  G22  G13  G20
G5   |  G22  G18  G16   G6  G15  G20  G13  G23  G19  G21  G14
G6   |  G12  G11   G3  G23  G22  G17  G16   G5   G4   G9   G8
G7   |   G5   G4   G2  G19  G18   G9   G8  G12  G11  G17  G16
G8   |   G3   G0  G12  G13  G20   G7   G1  G10   G2  G15   G6
G9   |   G0   G3  G11  G21  G14   G1   G7   G2  G10   G6  G15
G10  |  G20  G21   G1   G8   G9  G22  G23  G13  G14  G18  G19
G11  |  G23  G19   G9   G1   G7  G21  G14  G22  G18  G20  G13
G12  |  G18  G22   G8   G7   G1  G13  G20  G19  G23  G14  G21